FedGNN: Federated Graph Neural Network for Privacy-Preserving Recommendation
Chuhan Wu, Fangzhao Wu, Yang Cao, Yongfeng Huang, Xing Xie
Tsinghua University, Beijing 100084, China; Microsoft Research Asia, Beijing 100080, China; Kyoto University, Kyoto 615-8558, Japan
{wuchuhan15,wufangzhao}@gmail.com, [email protected], [email protected], [email protected]
ABSTRACT
Graph neural network (GNN) is widely used for recommendation to model high-order interactions between users and items. Existing GNN-based recommendation methods rely on centralized storage of user-item graphs and centralized model learning. However, user data is privacy-sensitive, and the centralized storage of user-item graphs may arouse privacy concerns and risks. In this paper, we propose a federated framework for privacy-preserving GNN-based recommendation, which can collectively train GNN models from decentralized user data and meanwhile exploit high-order user-item interaction information with privacy well protected. In our method, we locally train a GNN model in each user client based on the user-item graph inferred from the local user-item interaction data. Each client uploads the local gradients of the GNN to a server for aggregation, and the aggregated gradients are sent back to user clients for updating the local GNN models. Since local gradients may contain private information, we apply local differential privacy techniques to the local gradients to protect user privacy. In addition, in order to protect the items that users have interacted with, we propose to incorporate randomly sampled items as pseudo interacted items for anonymity. To incorporate high-order user-item interactions, we propose a user-item graph expansion method that can find neighboring users with co-interacted items and exchange their embeddings to expand the local user-item graphs in a privacy-preserving way. Extensive experiments on six benchmark datasets validate that our approach can achieve competitive results with existing centralized GNN-based recommendation methods and meanwhile effectively protect user privacy.
KEYWORDS
Personalized recommendation, Graph neural network, Privacy-preserving, Federated learning
ACM Reference Format:
Chuhan Wu, Fangzhao Wu, Yang Cao, Yongfeng Huang, and Xing Xie. 2021. FedGNN: Federated Graph Neural Network for Privacy-Preserving Recommendation. In Proceedings of ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2021). ACM, New York, NY, USA, 9 pages. https://doi.org/10.475/123_4
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
KDD 2021, August 2021, Singapore
© 2021 Copyright held by the owner/author(s). ACM ISBN 123-4567-24-567/08/06. https://doi.org/10.475/123_4
Figure 1: Comparisons between (a) centralized and (b) decentralized training of GNN-based recommendation models.
INTRODUCTION

Graph neural network (GNN) is widely used by many personalized recommendation methods in recent years [12, 31, 36], since it can capture high-order interactions between users and items on the user-item graph to enhance the user and item representations [2, 37, 38]. For example, Berg et al. [2] proposed to use graph convolutional autoencoders to learn user and item representations from the user-item bipartite graph. Wang et al. [31] proposed to use a three-hop graph attention network to capture the high-order interactions between users and items. These existing GNN-based recommendation methods usually necessitate centralized storage of the entire user-item graph to learn GNN models and the representations of users and items, which means that the user-item interaction data needs to be centrally stored, as shown in Fig. 1(a). However, user-item interaction data is highly privacy-sensitive, and its centralized storage can lead to privacy concerns of users and the risk of data leakage [24]. Moreover, under the pressure of strict data protection regulations such as GDPR (https://gdpr-info.eu), online platforms may not be able to centrally store user-item interaction data to learn GNN models for recommendation in the future.

An intuitive way to tackle the privacy issue of user-item interaction data is to locally store the raw data on user devices and learn local GNN models based on it, as shown in Fig. 1(b). However, in this scenario it is very difficult to train an accurate GNN model for recommendation, due to the following reasons. First, for most users the volume of interaction data on their devices is too small to locally train accurate GNN models. Thus, a unified framework that coordinates a large number of user clients to collectively learn an accurate global GNN model from decentralized user data is required.
Second, the local GNN model trained on local user data may convey private information, and it is challenging to protect user privacy when synthesizing the global GNN model from the local ones. Third, the local user data only contains first-order user-item interactions, and users' interacted items cannot be directly exchanged due to privacy restrictions. Thus, it is very challenging to exploit the high-order user-item interactions without privacy leakage.

In this paper, we propose a federated framework named FedGNN for privacy-preserving GNN-based recommendation, which can effectively exploit high-order user-item interaction information by collectively training GNN models for recommendation in a privacy-preserving way. Since user interaction data is highly decentralized, there is no global user-item graph. Thus, in our method each user device locally learns a GNN model and the embeddings of users and items based on the user-item graph inferred from the local user-item interaction data on this device. The user devices compute the gradients of the models and user/item embeddings and upload them to a central server, which aggregates the gradients from a number of users and distributes them to user devices for local updates. However, both the items with non-zero gradients and the GNN model gradients contain private information. Thus, we propose a privacy-preserving model update method to protect user-item interaction data without locally memorizing the full item set during model training. More specifically, we apply local differential privacy (LDP) techniques to the local gradients computed by user clients to protect user privacy. In addition, in order to protect the real items that a user interacted with when uploading the gradients of item embeddings, we generate random embedding gradients for a certain number of randomly sampled pseudo interacted items.
Besides, to exploit high-order information of the user-item graph without leaking user privacy, we propose a privacy-preserving user-item graph expansion method that aims to find the neighbors of users with co-interacted items and exchange their embeddings to expand the local user-item graphs. In this way, high-order information of the user-item graph can be exploited by the GNN model to enhance user and item representations, and the private user-item interaction data does not leak. We conduct extensive experiments on six widely used benchmark datasets for recommendation, and the results show that our approach can achieve competitive results with existing centralized GNN-based recommendation methods and meanwhile effectively protect user privacy.

The major contributions of this paper are summarized as follows:
• We propose a novel federated framework for privacy-preserving GNN-based recommendation that can exploit highly decentralized user data to collectively train GNN models.
• We propose to protect model gradients in model training with local differential privacy, and propose a pseudo interacted item sampling technique to protect the items that users have interacted with.
• We propose a privacy-preserving user-item graph expansion method to exploit high-order user-item interactions from decentralized user data.
• Extensive experiments and analysis on six benchmark datasets show that our approach can achieve competitive results with existing centralized GNN-based recommendation methods and meanwhile protect user privacy.
RELATED WORK

Graph neural networks are preferred by many recommendation methods to model high-order relations between users and items [2, 5, 7, 10, 27, 30–34, 36, 37]. For example, Berg et al. [2] proposed a graph convolutional matrix completion (GC-MC) approach. GC-MC uses a graph convolutional encoder to learn user and item representations from the user-item bipartite graph, and then predicts unknown ratings via a bilinear decoder. Ying et al. [36] proposed a graph convolutional neural network based method for recommendation named PinSage. It learns item representations from an item-item graph via 2-hop graph convolutions, and uses these representations in downstream recommendation tasks. Wang et al. [31] proposed a neural graph collaborative filtering (NGCF) approach that uses 3-hop graph neural networks to learn user and item embeddings from the user-item bipartite graph. Besides the user-item graph, several GNN-based recommendation methods also incorporate other kinds of graphs into recommendation, such as the user-item-entity graph [29] and the user-user-item graph [5]. For example, Wang et al. [30] proposed a knowledge-graph enhanced recommendation approach based on the Knowledge Graph Attention Network (KGAT). They use a 3-hop graph attention network to learn user, item and entity representations from a heterogeneous graph, which is formed by linking entities in knowledge graphs with items in the user-item graph. Fan et al. [5] proposed a social recommendation approach named GraphRec. They use graph attention networks to learn user and item embeddings from the user-item bipartite graph and the user-user social graph. However, these methods need centralized storage of users' interactions with items to form the entire user-item graph, which may arouse users' privacy concerns and the risk of private data leakage. Different from them, in our FedGNN method the raw user data never leaves the local user devices. In addition, FedGNN leverages a privacy-preserving model update method to protect private gradients and a privacy-preserving user-item graph expansion method to incorporate high-order user-item interactions. Thus, FedGNN can employ GNN models to capture high-order information in a privacy-preserving way.
Federated learning is a machine learning technique to collectively learn intelligent models based on decentralized user data in a privacy-preserving manner [14, 17]. Different from existing machine learning methods based on centralized storage of user data, in federated learning the user data is kept locally on user devices [35]. Each device maintains a local model and computes local model updates based on the user data stored on this device. The local model updates from a number of users are uploaded to a central server that coordinates the model training process. These updates are aggregated into a unified one to update the global model maintained by this server. The updated model is further distributed to all user devices to update the local models. This process is iteratively executed until the model converges. Since the model updates usually contain much less private information and the raw user data never leaves the devices, the risk of privacy leakage can be effectively reduced [8].

The framework of federated learning has been applied to personalized recommendation [1, 3, 6, 9, 22]. For example, Ammad et al. [1] proposed a federated collaborative filtering (FCF) approach. In FCF, each user device locally computes the gradients of the user and item embeddings based on the personal ratings stored on this device. The user embeddings are locally updated, and the gradients of the item embeddings are uploaded to a central server. The server aggregates the item gradients from massive clients to update the global item embeddings it maintains. The updated item embeddings are further distributed to user clients for local embedding updates. However, in this method the gradients of item embeddings may leak some information about the private ratings [39]. To solve this problem, Chai et al. [3] proposed a federated matrix factorization (FedMF) method, where the item embeddings are protected by homomorphic encryption techniques. However, these methods do not consider the high-order interactions between users and items, which may not be optimal for learning accurate user and item representations. In addition, they mainly focus on protecting the private ratings given by users and cannot protect the raw user-item interaction data unless they locally maintain the full item set on each device, which is impractical due to the heavy storage and communication costs. Different from these methods, our approach can capture high-order interactions between users and items via our proposed privacy-preserving user-item graph expansion method. In addition, our method can protect the raw user-item interaction data during the model training process in an effective and efficient way. To better demonstrate the advantage of our approach, we summarize the comparison between FedGNN and existing methods on exploiting high-order user-item interactions and privacy protection in Table 1.

Table 1: Comparison of different methods in terms of high-order user-item interaction modeling and privacy protection. "Cen." and "Local" represent centralized and decentralized data storage, respectively.

                                  PMF   SVD++  GRALS  sRGCNN  GC-MC  PinSage  NGCF  FCF    FedMF  FedGNN
High-order user-item interaction  ×     ✓      ✓      ✓       ✓      ✓        ✓     ×      ×      ✓
Rating protection                 ×     ×      ×      ×       ×      ×        ×     ✓      ✓      ✓
Interaction item protection       ×     ×      ×      ×       ×      ×        ×     ×      ×      ✓
User data storage                 Cen.  Cen.   Cen.   Cen.    Cen.   Cen.     Cen.  Local  Local  Local
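The generic federated averaging loop described above can be sketched in a few lines. This is a minimal illustration of the scheme, not any particular system's implementation; the function name, shapes, and learning rate are our own.

```python
import numpy as np

def fedavg_round(global_params, client_grads, lr=0.01):
    """One synchronous federated round: the server averages the gradients
    uploaded by the participating clients and applies an SGD step; the
    updated parameters are then redistributed to the clients."""
    avg_grad = np.mean(client_grads, axis=0)  # server-side aggregation
    return global_params - lr * avg_grad       # updated global model

# Toy usage: three clients upload gradients for a 4-dim parameter vector.
params = np.zeros(4)
grads = [np.ones(4), 2 * np.ones(4), 3 * np.ones(4)]
params = fedavg_round(params, grads, lr=0.1)   # average gradient is 2.0
```

In a real deployment the uploads would be the gradient messages described above rather than raw data, which is what limits privacy leakage.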
METHODOLOGY

In this section, we first present the problem definitions in our federated GNN-based recommendation framework (FedGNN), then introduce the details of our FedGNN approach for privacy-preserving recommendation, and finally provide some discussions and analysis on privacy protection.
Denote U = {u_1, u_2, ..., u_P} and T = {t_1, t_2, ..., t_Q} as the sets of users and items respectively, where P is the number of users and Q is the number of items. Denote the rating matrix between users and items as Y ∈ R^{P×Q}, which is used to form a bipartite user-item graph G based on the observed ratings Y^o. We assume that the user u_i has interactions with K items, which are denoted by [t_{i,1}, t_{i,2}, ..., t_{i,K}]. These items and the user u_i form a first-order local user-item subgraph G_i. The ratings given to these items by user u_i are denoted by [y_{i,1}, y_{i,2}, ..., y_{i,K}]. To protect user privacy (both the private ratings and the items a user has interacted with), each user device locally keeps the interaction data of this user, and the raw data never leaves the user device. We aim to predict the unobserved ratings (y ∈ Y \ Y^o) based on the interaction data G_i locally stored on user devices in a privacy-preserving way. Note that there is no global user-item interaction graph in our approach, and the local graphs are built and stored on different devices, which is very different from existing federated GNN methods [11, 18] that require the entire graph to be built and stored together on at least one platform or device. Next, we introduce the framework of our
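As a concrete picture of the per-device data in this formulation, the local first-order subgraph G_i can be represented roughly as follows. The field names are our own illustration, not the paper's implementation:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class LocalSubgraph:
    """First-order user-item subgraph G_i kept on a single user device."""
    user_id: int
    item_ids: List[int]                  # interacted items [t_{i,1}, ..., t_{i,K}]
    ratings: List[float]                 # observed ratings [y_{i,1}, ..., y_{i,K}]
    neighbor_ids: List[int] = field(default_factory=list)  # filled later by graph expansion

# A user who rated K = 3 items; this raw data never leaves the device.
g_i = LocalSubgraph(user_id=7, item_ids=[3, 42, 108], ratings=[4.0, 2.5, 5.0])
```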
FedGNN method for privacy-preserving GNN-based recommendation. It can leverage the highly decentralized user interaction data to learn GNN models for recommendation by exploiting the high-order user-item interactions in a privacy-preserving way. The framework of
FedGNN is shown in Fig. 2. It mainly consists of a central server and a large number of user clients. Each user client keeps a local subgraph that consists of the user's interaction histories with items and the neighbors of this user. Each client learns the user/item embeddings and the GNN models from its local subgraph, and uploads the gradients to a central server. The central server is responsible for coordinating these user clients in the model learning process by aggregating the gradients received from a number of user clients and delivering the aggregated gradients back to them. Next, we introduce how they work in detail.

Figure 2: The framework of our FedGNN approach.

The local subgraph on each user client is constructed from the user-item interaction data and the neighboring users that have co-interacted items with this user. The node of this user is connected to the nodes of the items she interacted with, and to the nodes of her neighboring users. An embedding layer is first used to convert the user node u_i, the K item nodes [t_{i,1}, t_{i,2}, ..., t_{i,K}] and the N neighboring user nodes [u_{i,1}, u_{i,2}, ..., u_{i,N}] into their embeddings, which are denoted as e^u_i, [e^t_{i,1}, e^t_{i,2}, ..., e^t_{i,K}] and [e^u_{i,1}, e^u_{i,2}, ..., e^u_{i,N}], respectively. Since the user embeddings may not be accurate enough when the model is not well-tuned, we first exclude the neighboring user embeddings from model learning for T epochs, and then incorporate them into model learning when they have been tuned. Note that the embeddings of the user u_i and the item embeddings can be locally tuned during model training, while the embeddings of neighboring users are fixed. (We find this method slightly outperforms using trainable neighboring user embeddings, as shown in the experiments. Thus, we prefer fixed ones to reduce the computational and communication costs of model training.)

Next, we apply a graph neural network to these embeddings to model the interactions between nodes on the local first-order subgraph. Various kinds of GNN networks can be used in our framework, such as the graph convolution network (GCN) [13], the gated graph neural network (GGNN) [16] and the graph attention network (GAT) [28]. The GNN model outputs the hidden representations of the user and item nodes, which are denoted as h^u_i, [h^t_{i,1}, h^t_{i,2}, ..., h^t_{i,K}] and [h^u_{i,1}, h^u_{i,2}, ..., h^u_{i,N}], respectively. Then, a rating predictor module is used to predict the ratings given by the user u_i to her interacted items (denoted by [ŷ_{i,1}, ŷ_{i,2}, ..., ŷ_{i,K}]) based on the embeddings of the items and this user. These predicted ratings are compared against the gold ratings locally stored on the user device to compute the loss function. For the user u_i, the loss function L_i is computed as L_i = (1/K) Σ_{j=1}^{K} |ŷ_{i,j} − y_{i,j}|. We use the loss L_i to derive the gradients of the models and embeddings, which are denoted by g^m_i and g^e_i, respectively. These gradients will be further uploaded to the server for aggregation.

The server aims to coordinate all user devices and compute the global gradients to update the model and embedding parameters on these devices. In each round, the server awakes a certain number of user clients to compute gradients locally and send them to the server. After the server receives the gradients from these users, the aggregator in this server aggregates these local gradients into a unified one, g. Then, the server sends the aggregated gradients to each client to conduct the local parameter update.
Denote the parameter set in the i-th user device as Θ_i. It is updated by Θ_i = Θ_i − α·g, where α is the learning rate. This process is iteratively executed until the model converges. We summarize the framework of our FedGNN method in Algorithm 1. We will then introduce two modules for privacy protection in FedGNN, i.e., a privacy-preserving model update module (applied to the gradients computed by LocalGradCal in Algorithm 1) for protecting gradients in the model update, and a privacy-preserving user-item graph expansion module (the PrivacyPreservingGraphExpansion call in Algorithm 1) to protect user privacy when modeling high-order user-item interactions.
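To make the local computation concrete, the following sketch implements the dot-product rating predictor and the loss L_i = (1/K) Σ_j |ŷ_{i,j} − y_{i,j}| in plain NumPy. The GNN layers are omitted for brevity, and all function and variable names are our own illustration, not the paper's code.

```python
import numpy as np

def local_grad_step(user_emb, item_embs, ratings):
    """Local loss and embedding gradients for one client.
    Predictor: y_hat_ij = e_t_ij . e_u_i (dot product);
    loss: mean absolute error over the K locally rated items."""
    preds = item_embs @ user_emb                 # predicted ratings, shape (K,)
    err = preds - np.asarray(ratings)
    loss = np.mean(np.abs(err))
    sign = np.sign(err)[:, None]
    g_user = (sign * item_embs).mean(axis=0)     # dL/d(user embedding)
    g_items = sign * user_emb / len(ratings)     # dL/d(each item embedding)
    return loss, g_user, g_items

# Toy usage: 2-dim embeddings, K = 2 rated items.
loss, g_user, g_items = local_grad_step(
    np.array([1.0, 0.0]),
    np.array([[2.0, 0.0], [0.0, 1.0]]),
    [1.0, 1.0],
)
```

The returned pair (g_user, g_items) plays the role of the embedding gradients g^e_i that are uploaded for aggregation.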
Algorithm 1 FedGNN
  Each client constructs its local subgraph G_i
  Initialize Θ_i on each user client using the same seed
  Iteration count c ← 0
  Graph expansion switch s ← True
  // Server
  repeat
    Select a subset S from the user set U randomly
    g ← 0
    for each user client u_i ∈ S do
      if c·|S| < T·P then
        g ← g + LocalGradCal(G_i, False)
      else
        g ← g + LocalGradCal(G_i, True)
      end if
    end for
    if c·|S| ≥ T·P and s then
      PrivacyPreservingGraphExpansion()
      s ← False
    end if
    g ← g / |S|
    Distribute g to user clients for local update
  until model convergence
  // User Client
  LocalGradCal(G_i, includeNeighbor):
    Select a mini-batch of data N from G_i
    if includeNeighbor then
      Use neighboring user embeddings
    end if
    Compute GNN model gradients g^m_i and embedding gradients g^e_i on N
    g_i ← (g^m_i, g^e_i)
    return g_i

(We use the FedAvg [17] algorithm to implement the aggregator. When distributing g, the server only sends the gradients of the model and of the corresponding user and item embeddings, including the pseudo interacted ones.)

If we directly upload the GNN model and item embedding gradients, there may be some privacy issues, due to the following reasons. First, for embedding gradients, only the items that a user has interacted with have non-zero gradients to update their embeddings, so the server can directly recover the full user-item interaction history from the non-zero item embedding gradients. Second, besides the embedding gradients, the gradients of the GNN model and rating predictor may also leak private information about user histories and ratings [39], because the GNN model gradients encode the preferences of users on items. In existing methods such as FedMF [3], homomorphic encryption techniques are applied to the gradients to protect private ratings. However, in this method the user device needs to locally memorize the embedding table of the entire item set T and upload it in every iteration to protect the user interaction history, which is impractical due to the huge storage and communication costs during model training.

To tackle these challenges, we propose two strategies to protect user privacy in the model update process. The first one is pseudo interacted item sampling. Concretely, we sample M items that the user has not interacted with, and randomly generate their gradients g^p_i using a Gaussian distribution with the same mean and co-variance values as the real item embedding gradients. (There are many possible sampling methods, such as using the displayed items that have no interaction with a user. In our experiments we randomly sample items from the full item set for simulation. In addition, M needs to be larger than K to protect user privacy.) The real embedding gradients g^e_i are combined with the pseudo item embedding gradients g^p_i, and the unified gradient of the model and embeddings on the i-th user device (the output of LocalGradCal in Algorithm 1) is modified as g_i = (g^m_i, g^e_i, g^p_i). The second one is local differential privacy. Following [21], we clip the local gradients on user clients based on their L∞-norm with a threshold δ, and apply a local differential privacy (LDP) [4] module with zero-mean Laplacian noise to the unified gradients to achieve better user privacy protection, which is formulated as follows:

g̃_i = clip(g_i, δ) + Laplace(0, λ),  (1)

where λ is the strength of the Laplacian noise. (The privacy budget ε can be bounded by 2δ/λ.) The protected gradients g̃_i are uploaded to the server for aggregation.

Figure 3: The framework of the privacy-preserving user-item graph expansion method.

Next, we introduce our privacy-preserving user-item graph expansion method, which aims to find the neighbors of users and extend the local user-item graphs in a privacy-preserving way. In existing GNN-based recommendation methods based on centralized graph storage, high-order user-item interactions can be directly derived from the global user-item graph. However, when user data is decentralized, it is a non-trivial task to incorporate high-order user-item interactions without violating user privacy protection. To solve this problem, we propose a privacy-preserving user-item graph expansion method that finds the anonymous neighbors of users to enhance user and item representation learning, without leaking user privacy. Its framework is shown in Fig. 3. We introduce its details as follows.

The central server that maintains the recommendation services first generates a public key, and then distributes it to all user clients for encryption. After receiving the public key, each user device applies homomorphic encryption [3] based on this key to the IDs of the items she interacted with, because the IDs of these items are privacy-sensitive. The encrypted item IDs, together with the embedding of this user, are uploaded to a third-party server (which does not need to be trusted). This server finds the users who interacted with the same items via item matching, and then provides each user with the embeddings of her anonymous neighbors. In this stage, the recommendation server never receives the private information of users, and the third-party server cannot obtain any private information of users and items since it cannot decrypt the item IDs. We then connect each user node with its neighboring user nodes. In this way, the local user-item graph can be enriched with high-order user-item interactions without harming the protection of user privacy. We summarize the process of our privacy-preserving user-item graph expansion method in Algorithm 2.

Algorithm 2 Privacy-preserving user-item graph expansion
  PrivacyPreservingGraphExpansion():
    Server sends a public key p to user clients
    User clients encrypt item IDs with p
    User clients upload the user embedding and encrypted item IDs to a third-party server
    Third-party server distributes neighboring user embeddings to user clients
    User clients extend their local graphs
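The two gradient-protection strategies above (pseudo interacted item sampling and the LDP module of Eq. (1)) can be sketched as follows. The diagonal-Gaussian approximation of the mean/covariance matching and the norm-rescaling form of the clip are our own simplifications, made only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def protect_item_gradients(real_grads, n_pseudo, delta, lam):
    """Sketch of the privacy-preserving model update described above.
    1) Pseudo interacted item sampling: draw n_pseudo fake gradients from
       a Gaussian with the same per-dimension mean/std as the real ones.
    2) LDP: clip each gradient row by its L-infinity norm at `delta`,
       then add zero-mean Laplace noise of strength `lam` (Eq. (1))."""
    mu, sigma = real_grads.mean(axis=0), real_grads.std(axis=0)
    pseudo = rng.normal(mu, sigma, size=(n_pseudo, real_grads.shape[1]))
    g = np.vstack([real_grads, pseudo])   # real rows are hidden among pseudo ones
    inf_norm = np.abs(g).max(axis=1, keepdims=True)
    g = g * np.minimum(1.0, delta / np.maximum(inf_norm, 1e-12))  # L-inf clip
    return g + rng.laplace(0.0, lam, size=g.shape)                # Laplace noise

# K = 3 real item gradients of dimension 8, M = 10 pseudo items,
# delta = 0.1 and lam = 0.2 as in the later experimental settings.
g_tilde = protect_item_gradients(rng.normal(size=(3, 8)), 10, 0.1, 0.2)
```

After this step the server sees K + M noisy rows and cannot tell which of them correspond to real interactions.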
The user privacy is protected in four aspects in our FedGNN approach. First, in FedGNN the recommendation server never collects raw user-item interaction data, and only locally computed gradients are uploaded to this server. Based on the data processing inequality [17], we can infer that these gradients contain much less private information than the raw user interaction data. Second, the third-party server also cannot infer private information from the encrypted item IDs, since it cannot obtain the private key. (We choose homomorphic encryption because the server cannot match the items hashed by many other salted encryption methods. In addition, the neighboring user nodes are not connected to the co-interacted items, for better user privacy protection under Byzantine attacks.) However, if the recommendation server colluded with the third-party server by exchanging the private key and item table, the user interaction history would not be protected; we assume that they do not collude with each other. Fortunately, the private ratings can still be protected by our privacy-preserving model update method. Third, in FedGNN we propose a pseudo interacted item sampling method to protect the real interacted items by sampling a number of items that a user has not interacted with. Since the gradients of both kinds of items have the same mean and co-variance values, it is difficult to discriminate the real interacted items from the pseudo ones if the number of pseudo interacted items is sufficiently larger than the number of real interacted items. The average degree of privacy protection is proportional to 1 + MP/|Y^o| [26]. Thus, the number of pseudo interacted items can be relatively large to achieve better privacy protection, as long as the computation resources of user devices permit. Fourth, we apply the LDP technique to the gradients locally computed by the user device, making it more difficult to recover the raw user consumption history from these gradients. It is shown in [21] that the upper bound of the privacy budget ε is 2δ/λ, which means that we can achieve a smaller privacy budget ε by using a smaller clipping threshold δ or a larger noise strength λ. However, the accuracy of the model gradients will also be affected if the privacy budget is too small. Thus, we need to properly choose both hyperparameters to balance model performance and privacy protection.
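This trade-off can be checked numerically. The bound is reconstructed here as ε ≤ 2δ/λ (the exact expression is garbled in the source text, but this form is consistent with the experimental setting δ = 0.1, λ = 0.2 being described as 1-differential privacy):

```python
def privacy_budget(delta, lam):
    """Upper bound on the LDP privacy budget from [21]: epsilon <= 2*delta/lam.
    A smaller clipping threshold delta or a larger noise strength lam
    yields a smaller (i.e., stronger) budget."""
    return 2.0 * delta / lam

# The hyperparameters used in the experiments give a budget of 1.
eps = privacy_budget(0.1, 0.2)
```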
EXPERIMENTS

In our experiments, following [2] we use six widely used benchmark datasets for recommendation, including MovieLens (100K, 1M, and 10M), Flixster, Douban, and YahooMusic. We use the preprocessed subsets of the Flixster, Douban, and YahooMusic datasets provided by [20]. We denote the three versions of MovieLens as ML-100K, ML-1M and ML-10M respectively, and we denote YahooMusic as Yahoo. The detailed statistics of these datasets are summarized in Table 2.

Table 2: Statistics of the datasets.

Dataset    #Users   #Items   #Ratings     Rating scale
Flixster   3,000    3,000    26,173       0.5, 1, ..., 5
Douban     3,000    3,000    136,891      1, 2, ..., 5
Yahoo      3,000    3,000    5,335        1, 2, ..., 100
ML-100K    943      1,682    100,000      1, 2, ..., 5
ML-1M      6,040    3,706    1,000,209    1, 2, ..., 5
ML-10M     69,878   10,677   10,000,054   0.5, 1, ..., 5
In our experiments, we use graph attention network (GAT) [28]as the GNN model, and use dot product to implement the ratingpredictor. The user and item embeddings and their hidden represen-tations learned by graph neural networks are 256-dim. The epochthreshold 𝑇 is 2. The gradient clipping threshold 𝛿 is set to 0.1, andthe strength of Laplacian noise in the LDP module is set to 0.2 toachieve 1-differential privacy. The number of pseudo interacteditems is set to 1,000. The number of users used in each round ofmodel training is 128, and the total number of epoch is 3. The ratioof dropout [25] is 0.2. SGD is selected as the optimization algorithm,and its learning rate is 0.01. The splits of datasets are the same asthose used in [2], and these hyperparameters are selected accordingto the validation performance. The metric used in our experimentsis rooted mean square error (RMSE), and we report the averageRMSE scores over the 10 repetitions. First, we compare the performance of our
FedGNN approach with several recommendation methods based on centralized storage of user data and several privacy-preserving ones based on federated learning, including:

• PMF [19], probabilistic matrix factorization, which is a widely used recommendation method;

(The MovieLens datasets are available at https://grouplens.org/datasets/movielens/; the Flixster, Douban, and Yahoo splits are available at https://github.com/fmonti/mgcnn.)

Table 3: Performance of different methods in terms of RMSE. Results of FedGNN and the best-performed baseline are in bold.

Methods        Flixster   Douban   Yahoo   ML-100K   ML-1M   ML-10M
PMF [19]       1.375      0.886    26.6    0.965     0.883   0.856
SVD++ [15]     1.155      0.869    24.4    0.952     0.860   0.834
GRALS [23]     1.313      0.833    38.0    0.934     0.849   0.808
sRGCNN [20]    1.179      0.801    22.4    0.922     0.837   0.789
GC-MC [2]      —          —        —       —         —       —
PinSage [36]   0.945      —        —       —         —       —
[Figure 4 shows six panels (Flixster, Douban, Yahoo, ML-100K, ML-1M, ML-10M), each plotting RMSE for GGNN, GCN, and GAT implementations of FedGNN without neighbor user embeddings, with trainable neighbor user embeddings, and with fixed neighbor user embeddings.]
Figure 4: Influence of second-order information and different GNN architectures.

• SVD++ [15], another popular recommendation method based on a variant of singular value decomposition;
• GRALS [23], a collaborative filtering approach with graph information;
• sRGCNN [20], a matrix completion method with recurrent multi-graph neural networks;
• GC-MC [2], a matrix completion method based on graph convolutional autoencoders;
• PinSage [36], a recommendation approach based on 2-hop GCN networks;
• NGCF [31], a neural graph collaborative filtering method;
• FCF [1], a privacy-preserving recommendation approach based on federated collaborative filtering;
• FedMF [3], another privacy-preserving recommendation approach based on secure matrix factorization.

The recommendation performance of these methods is summarized in Table 3, from which we have several findings. First, we observe that the methods which incorporate high-order information of the user-item graph (e.g., GC-MC, PinSage, and NGCF) achieve better performance than those based on first-order information only (e.g., PMF). This is probably because modeling the high-order interactions between users and items can enhance user and item representation learning and thereby improve the accuracy of recommendation. Second, compared with methods based on centralized storage of user-item interaction data, such as GC-MC and NGCF, our FedGNN approach achieves comparable or even better performance. This shows that our approach can protect user privacy while achieving satisfactory recommendation performance. Third, among the compared privacy-preserving recommendation methods, FedGNN achieves the best performance. This is because FedGNN can incorporate high-order information of the user-item graphs, while FCF and FedMF cannot. Besides, our approach can protect both ratings and user-item interaction histories, while FCF and FedMF can only protect ratings.
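The interaction-history protection mentioned above relies on mixing randomly sampled pseudo interacted items into each client's upload. A minimal sketch of that step is below; the function name, the Gaussian gradient distribution, and the scale matching are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def build_uploaded_item_set(real_items, n_items, m, embed_dim, real_grads):
    """Mix m randomly sampled pseudo items into the set of items whose
    gradients a client uploads, giving the pseudo items random gradients
    of a similar scale so the server cannot tell which items the user
    actually interacted with."""
    # sample pseudo items from the items the user did NOT interact with
    candidates = np.setdiff1d(np.arange(n_items), real_items)
    pseudo = rng.choice(candidates, size=m, replace=False)
    # random gradients for pseudo items (scale matched to real gradients)
    scale = real_grads.std() if real_grads.size else 1.0
    pseudo_grads = rng.normal(0.0, scale, size=(m, embed_dim))
    items = np.concatenate([real_items, pseudo])
    grads = np.vstack([real_grads, pseudo_grads])
    order = rng.permutation(len(items))  # shuffle so position leaks nothing
    return items[order], grads[order]
```

The shuffle at the end matters: without it, the server could separate real from pseudo items simply by their position in the upload.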
Then, we validate the effectiveness of incorporating high-order information of the user-item graphs as well as the generality of our approach. We compare the performance of FedGNN with its variants that use fully trainable neighbor user embeddings or that discard high-order user-item interactions, and we also compare their results under different implementations of the GNN model (GGNN, GCN, and GAT). The results, shown in Fig. 4, reveal several findings. First, compared with the baseline performance reported in Table 3, the performance of FedGNN and its variants implemented with different GNN models is satisfactory, which shows that our approach is compatible with different GNN architectures. Second, FedGNN (based on GAT) slightly outperforms its variants based on GCN and GGNN. This may be because the GAT network can model the importance of the interactions between nodes more effectively than GCN and GGNN, which is beneficial for user and item modeling. Third, the variants that utilize high-order information through our FedGNN framework perform better than those without high-order information, which validates the effectiveness of our approach in incorporating high-order information of the user-item graph into recommendation. Fourth, we find that using fixed neighbor user embeddings that are trained for a certain number of iterations is slightly better than using fully trainable ones that are updated in each iteration. This may be because the neighboring user embeddings may not be accurate at the
beginning of model training, which is not beneficial for learning precise user and item representations.

Figure 5: The recommendation RMSE (left y-axis) and privacy budget 𝜖 (right y-axis) w.r.t. different clipping thresholds 𝛿 and noise strengths 𝜆 (values 0.05, 0.1, 0.2), on the Flixster, Douban, Yahoo, ML-100K, ML-1M, and ML-10M datasets.

Figure 6: The recommendation RMSE (left y-axis) and communication cost (right y-axis) w.r.t. different numbers 𝑀 of pseudo interacted items, on the same six datasets.
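As a concrete reference for the GNN architectures compared in Fig. 4, a single-head graph-attention aggregation step (the GAT operation our model adopts) can be sketched as follows. This is an illustrative numpy re-implementation of the standard GAT layer [28], not code from our system; all names and dimensions are arbitrary.

```python
import numpy as np

def gat_layer(h, adj, W, a, alpha=0.2):
    """One single-head graph-attention aggregation step: project node
    features, score each edge with a LeakyReLU of the concatenated
    projections, softmax the scores over each node's neighbors, and
    aggregate neighbor projections with those attention weights."""
    z = h @ W                                   # (N, F) -> (N, F')
    n = z.shape[0]
    e = np.zeros((n, n))
    for i in range(n):                          # e[i, j] = LeakyReLU(a^T [z_i || z_j])
        for j in range(n):
            s = a @ np.concatenate([z[i], z[j]])
            e[i, j] = s if s > 0 else alpha * s
    e = np.where(adj > 0, e, -1e9)              # mask non-edges before softmax
    att = np.exp(e - e.max(axis=1, keepdims=True))
    att = att / att.sum(axis=1, keepdims=True)  # rows sum to 1 over neighbors
    return att, att @ z
```

The O(N^2) double loop is only for clarity; a practical implementation would score edges, not all node pairs.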
Finally, we explore the influence of three important hyperparameters, i.e., the gradient clipping threshold 𝛿, the strength 𝜆 of the Laplacian noise in the LDP module, and the number 𝑀 of pseudo interacted items. We first compare the performance of our FedGNN approach under different combinations of 𝛿 and 𝜆, and the results are plotted in Fig. 5. A larger 𝜆 and a smaller 𝛿 mean a smaller privacy budget 𝜖, i.e., better privacy protection; we achieve 1-differential privacy under 𝛿 = 0.1 and 𝜆 = 0.2. According to these results, the differences in model performance under different values of 𝛿 are small, and although a larger 𝜆 hurts accuracy, the performance loss is not too heavy if 𝜆 is not too large. Thus, a moderate value of 𝜆 such as 0.2 is preferable to achieve a good balance between privacy protection and recommendation accuracy. We also compare the performance and communication cost of FedGNN w.r.t. different 𝑀 in Fig. 6, where the communication cost is measured by the number of parameters to be exchanged in each iteration during model training. From Fig. 6, we observe that the performance is the best if 𝑀 is 0, but then the user-item interaction histories cannot be protected. According to the discussion in Section 3.5, if 𝑀 is too small the user privacy cannot be well protected. In addition, a larger 𝑀 also makes the performance decline, because the randomly generated gradients of pseudo interacted items reduce the accuracy of the aggregated item gradients. By comparing the results on the three MovieLens datasets, we find that when the rating matrix is sparser, 𝑀 may need to be larger to keep good recommendation performance. This is because when 𝑀 is relatively large, the random gradients of pseudo interacted items are better counteracted after aggregation and their influence is mitigated. However, as shown in Fig. 6, the communication cost is also proportional to 𝑀 and becomes very heavy if 𝑀 is too large. Therefore, we choose 𝑀 = 1,000 to achieve good privacy protection and recommendation performance under reasonable communication cost.
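The privacy accounting just discussed (a larger 𝜆 and a smaller 𝛿 give a smaller budget 𝜖) can be illustrated with a standard Laplace-mechanism sketch. The per-value clipping and the budget formula 𝜖 = 2𝛿/𝜆 below are our reading of this standard setup, not code from the paper; with 𝛿 = 0.1 and 𝜆 = 0.2 it reproduces the 1-differential privacy quoted in the experimental settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def ldp_protect(grad, delta=0.1, lam=0.2):
    """Clip each gradient value to [-delta, delta], then add zero-mean
    Laplacian noise of scale lam. After clipping, each value can change
    by at most 2*delta between any two inputs, so the Laplace mechanism
    yields a per-value budget epsilon = 2*delta/lam."""
    clipped = np.clip(grad, -delta, delta)
    noisy = clipped + rng.laplace(0.0, lam, size=grad.shape)
    epsilon = 2 * delta / lam
    return noisy, epsilon

_, eps = ldp_protect(np.ones(256), delta=0.1, lam=0.2)
print(eps)  # prints 1.0
```

This also makes the trade-off in Fig. 5 concrete: raising 𝜆 shrinks 𝜖 (better privacy) but injects heavier noise into the uploaded gradients, which is where the accuracy loss comes from.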
(The rating density of the ML-100K, ML-1M, and ML-10M datasets is 0.0630, 0.0447, and 0.0134, respectively.)

In this paper, we propose a federated framework for privacy-preserving GNN-based recommendation, which aims to collectively train GNN models from decentralized user data by exploiting high-order user-item interactions in a privacy-preserving manner. Concretely, we locally train a GNN model in each user client based on the local user-item graph stored on the device. Each client uploads the locally computed gradients to a server for aggregation, and the aggregated gradients are further sent back to user clients for local updates. In addition, to protect user-item interaction data during model training, we apply local differential privacy techniques to the local gradients to enhance user privacy protection. Moreover, we sample pseudo interacted items to protect the embeddings of items that users have interactions with. Besides, to incorporate high-order user-item interaction information into model learning, we propose a privacy-preserving user-item graph expansion method that can find neighboring users with co-interacted items and exchange their embeddings for extending local graphs. Extensive experiments on six benchmark datasets show that our approach can achieve competitive performance with existing methods based on centralized storage of user-item interaction data and meanwhile effectively protect user privacy.
REFERENCES

[1] Muhammad Ammad, Elena Ivannikova, Suleiman A Khan, Were Oyomno, Qiang Fu, Kuan Eeik Tan, and Adrian Flanagan. 2019. Federated Collaborative Filtering for Privacy-Preserving Personalized Recommendation System. arXiv preprint arXiv:1901.09888.
[2] Rianne van den Berg, Thomas N Kipf, and Max Welling. 2017. Graph Convolutional Matrix Completion. arXiv preprint arXiv:1706.02263.
[3] Di Chai, Leye Wang, Kai Chen, and Qiang Yang. 2020. Secure Federated Matrix Factorization. IEEE Intelligent Systems (2020).
[4] Woo-Seok Choi, Matthew Tomei, Jose Rodrigo Sanchez Vicarte, Pavan Kumar Hanumolu, and Rakesh Kumar. 2018. Guaranteeing Local Differential Privacy on Ultra-Low-Power Systems. In ISCA. 561–574.
[5] Wenqi Fan, Yao Ma, Qing Li, Yuan He, Eric Zhao, Jiliang Tang, and Dawei Yin. 2019. Graph Neural Networks for Social Recommendation. In WWW. 417–426.
[6] Adrian Flanagan, Were Oyomno, Alexander Grigorievskiy, Kuan Eeik Tan, Suleiman A Khan, and Muhammad Ammad-Ud-Din. 2020. Federated Multi-view Matrix Factorization for Personalized Recommendations. arXiv preprint arXiv:2004.04256.
[7] Suyu Ge, Chuhan Wu, Fangzhao Wu, Tao Qi, and Yongfeng Huang. 2020. Graph Enhanced Representation Learning for News Recommendation. In WWW. 2863–2869.
[8] Andrew Hard, Kanishka Rao, Rajiv Mathews, Swaroop Ramaswamy, Françoise Beaufays, Sean Augenstein, Hubert Eichner, Chloé Kiddon, and Daniel Ramage. 2018. Federated Learning for Mobile Keyboard Prediction. arXiv preprint arXiv:1811.03604.
[9] István Hegedűs, Gábor Danner, and Márk Jelasity. 2019. Decentralized Recommendation Based on Matrix Factorization: A Comparison of Gossip and Federated Learning. In ECML-PKDD. Springer, 317–332.
[10] Linmei Hu, Chen Li, Chuan Shi, Cheng Yang, and Chao Shao. 2020. Graph Neural News Recommendation with Long-term and Short-term Interest Modeling. Information Processing & Management 57, 2 (2020), 102142.
[11] Meng Jiang, Taeho Jung, Ryan Karl, and Tong Zhao. 2020. Federated Dynamic GNN with Secure Aggregation. arXiv preprint arXiv:2009.07351.
[12] Bowen Jin, Chen Gao, Xiangnan He, Depeng Jin, and Yong Li. 2020. Multi-behavior Recommendation with Graph Convolutional Networks. In SIGIR. 659–668.
[13] Thomas N. Kipf and Max Welling. 2017. Semi-Supervised Classification with Graph Convolutional Networks. In ICLR.
[14] Jakub Konečný, H Brendan McMahan, Felix X Yu, Peter Richtárik, Ananda Theertha Suresh, and Dave Bacon. 2016. Federated Learning: Strategies for Improving Communication Efficiency. arXiv preprint arXiv:1610.05492.
[15] Yehuda Koren. 2008. Factorization Meets the Neighborhood: A Multifaceted Collaborative Filtering Model. In KDD. 426–434.
[16] Yujia Li, Daniel Tarlow, Marc Brockschmidt, and Richard S. Zemel. 2016. Gated Graph Sequence Neural Networks. In ICLR.
[17] Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, and Blaise Aguera y Arcas. 2017. Communication-Efficient Learning of Deep Networks from Decentralized Data. In AISTATS. 1273–1282.
[18] Guangxu Mei, Ziyu Guo, Shijun Liu, and Li Pan. 2019. SGNN: A Graph Neural Network Based Federated Learning Approach by Hiding Structure. In IEEE Big Data. IEEE, 2560–2568.
[19] Andriy Mnih and Russ R Salakhutdinov. 2008. Probabilistic Matrix Factorization. In NIPS. 1257–1264.
[20] Federico Monti, Michael Bronstein, and Xavier Bresson. 2017. Geometric Matrix Completion with Recurrent Multi-graph Neural Networks. In NIPS. 3697–3707.
[21] Tao Qi, Fangzhao Wu, Chuhan Wu, Yongfeng Huang, and Xing Xie. 2020. FedRec: Privacy-Preserving News Recommendation with Federated Learning. arXiv preprint arXiv:2003.09592.
[22] Tao Qi, Fangzhao Wu, Chuhan Wu, Yongfeng Huang, and Xing Xie. 2020. Privacy-Preserving News Recommendation Model Training via Federated Learning. arXiv preprint arXiv:2003.09592.
[23] Nikhil Rao, Hsiang-Fu Yu, Pradeep K Ravikumar, and Inderjit S Dhillon. 2015. Collaborative Filtering with Graph Information: Consistency and Scalable Methods. In NIPS. 2107–2115.
[24] Hyejin Shin, Sungwook Kim, Junbum Shin, and Xiaokui Xiao. 2018. Privacy Enhanced Matrix Factorization for Recommendation with Local Differential Privacy. TKDE 30, 9 (2018), 1770–1782.
[25] Nitish Srivastava, Geoffrey E Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: A Simple Way to Prevent Neural Networks from Overfitting. JMLR 15, 1 (2014), 1929–1958.
[26] Latanya Sweeney. 2002. k-anonymity: A Model for Protecting Privacy. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 10, 05 (2002), 557–570.
[27] Zhulin Tao, Yinwei Wei, Xiang Wang, Xiangnan He, Xianglin Huang, and Tat-Seng Chua. 2020. MGAT: Multimodal Graph Attention Network for Recommendation. Information Processing & Management 57, 5 (2020), 102277.
[28] Petar Velickovic, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. 2018. Graph Attention Networks. In ICLR.
[29] Hongwei Wang, Fuzheng Zhang, Mengdi Zhang, Jure Leskovec, Miao Zhao, Wenjie Li, and Zhongyuan Wang. 2019. Knowledge-aware Graph Neural Networks with Label Smoothness Regularization for Recommender Systems. In KDD. 968–977.
[30] Xiang Wang, Xiangnan He, Yixin Cao, Meng Liu, and Tat-Seng Chua. 2019. KGAT: Knowledge Graph Attention Network for Recommendation. In KDD. 950–958.
[31] Xiang Wang, Xiangnan He, Meng Wang, Fuli Feng, and Tat-Seng Chua. 2019. Neural Graph Collaborative Filtering. In SIGIR. 165–174.
[32] Ziyang Wang, Wei Wei, Gao Cong, Xiao-Li Li, Xian-Ling Mao, and Minghui Qiu. 2020. Global Context Enhanced Graph Neural Networks for Session-based Recommendation. In SIGIR. 169–178.
[33] Chuhan Wu, Fangzhao Wu, Tao Qi, Suyu Ge, Yongfeng Huang, and Xing Xie. 2019. Reviews Meet Graphs: Enhancing User and Item Representations for Recommendation with Hierarchical Attentive Graph Neural Network. In EMNLP-IJCNLP. 4886–4895.
[34] Shu Wu, Yuyuan Tang, Yanqiao Zhu, Liang Wang, Xing Xie, and Tieniu Tan. 2019. Session-based Recommendation with Graph Neural Networks. In AAAI, Vol. 33. 346–353.
[35] Qiang Yang, Yang Liu, Tianjian Chen, and Yongxin Tong. 2019. Federated Machine Learning: Concept and Applications. TIST 10, 2 (2019), 1–19.
[36] Rex Ying, Ruining He, Kaifeng Chen, Pong Eksombatchai, William L Hamilton, and Jure Leskovec. 2018. Graph Convolutional Neural Networks for Web-scale Recommender Systems. In KDD. 974–983.
[37] Jiani Zhang, Xingjian Shi, Shenglin Zhao, and Irwin King. 2019. STAR-GCN: Stacked and Reconstructed Graph Convolutional Networks for Recommender Systems. In IJCAI. AAAI Press, 4264–4270.
[38] Jie Zhou, Ganqu Cui, Zhengyan Zhang, Cheng Yang, Zhiyuan Liu, Lifeng Wang, Changcheng Li, and Maosong Sun. 2018. Graph Neural Networks: A Review of Methods and Applications. arXiv preprint arXiv:1812.08434.
[39] Ligeng Zhu, Zhijian Liu, and Song Han. 2019. Deep Leakage from Gradients. arXiv preprint arXiv:1906.08935.