Privacy-preserving Cloud-based DNN Inference
Shangyu Xie, Bingyu Liu and Yuan Hong
Illinois Institute of Technology
{sxie14, bliu40}@hawk.iit.edu, [email protected]

ABSTRACT
Deep learning as a service (DLaaS) has been intensively studied to facilitate the wider deployment of emerging deep learning applications. However, DLaaS may compromise the privacy of both clients and cloud servers. Although some privacy-preserving deep neural network (DNN) techniques have been proposed by composing cryptographic primitives, the challenges on computational efficiency have not been fully addressed due to the complexity of DNN models and the expensive cryptographic primitives. In this paper, we propose a novel privacy-preserving cloud-based DNN inference framework ("PROUD"), which greatly improves the computational efficiency. Finally, we conduct experiments on two datasets to validate the effectiveness and efficiency of PROUD while benchmarking with the state-of-the-art techniques.
1. INTRODUCTION
Deep neural network (DNN) models have been widely deployed in a variety of real-world applications, such as image classification [1], video recognition [2], and voice assistants (e.g., Apple Siri and Google Assistant). Meanwhile, cloud computing technologies (e.g., Microsoft Azure Machine Learning, Google Inference API, and Amazon AWS Machine Learning) have promoted deep learning as a service (DLaaS) to make DNNs widely accessible. Users can outsource their own data for inference based on the pre-trained DNN models provided by the cloud service provider.

However, severe privacy concerns may arise in such applications. First, if the data of the clients are explicitly disclosed to the cloud, sensitive personal information included in the outsourced data would be leaked. Second, if the fine-tuned DNN models are shared for inference [3], the parameters might be reconstructed by untrusted parties [4]. To address such privacy concerns, several recent works [5, 6, 7, 8] have proposed cryptographic protocols to ensure privacy in inference via garbled circuits [9] and/or homomorphic encryption [10], which rely on expensive cryptographic primitives. Consequently, such protocols may incur fairly high computation and communication overheads. Since the volume of outsourced data grows rapidly and DNN models usually require substantial computational resources in the cloud, such techniques may not be suitable for practical deployment due to limited scalability. Thus, we seek an efficient scheme to securely implement DNN inference in the cloud.

Specifically, we propose a privacy-preserving cloud-based DNN inference framework ("PROUD") by co-designing cryptographic primitives, deep learning, and cloud computing technologies. We mainly take advantage of a novel matrix permutation with ciphertext packing and parallelization to improve the computational efficiency of the linear layers. With the privacy guarantee provided via homomorphic encryption, PROUD supports all types of non-linear activation functions by leveraging an interactive paradigm. Above all, PROUD integrates cloud container technology to further improve the performance via parallel execution, and it can be readily adapted to various DNNs by configuring container images.
2. THE PROUD SYSTEM
Figure 1 illustrates the framework of the proposed system for the users (clients) and the cloud service provider (cloud server). The client locally holds the private data, which is encrypted with the client's public key and sent to the cloud server. The cloud server then initializes container instances (pre-compiled with the secure protocols, i.e., MatF and NlnF) to execute the DNN inference on the encrypted input. Finally, the client receives and decrypts the classification result.

Fig. 1. The PROUD Framework.
Automated Backend Execution. The backend system can automatically deploy the cryptographic protocols for secure data inference in the cloud. Specifically, once the server receives encrypted data from the client, it composes a configuration file to initialize a set of container instances from a pre-compiled image (containing the source code), in which the secure protocols (i.e., MatF and NlnF) are executed for DNN inference until the final result is returned, as sketched below. The automation of the backend ensures that the secure protocols can be delivered efficiently, and enables the full system to serve a large number of clients (if necessary).
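As a concrete illustration of this automated backend, the sketch below launches a set of pre-compiled container instances via the Docker SDK for Python. The image name proud-image, the PROTOCOL environment variable, and the orchestration logic are hypothetical stand-ins for exposition, not the actual PROUD source:

```python
import docker  # Docker SDK for Python (docker-py)

def launch_backend(num_instances: int, protocol: str):
    """Start pre-compiled PROUD containers that execute the secure protocols."""
    client = docker.from_env()
    containers = []
    for i in range(num_instances):
        # Each instance runs the same pre-compiled image; the subprotocol
        # (e.g., "MatF" or "NlnF") is selected via the configuration.
        c = client.containers.run(
            image="proud-image:latest",  # hypothetical image name
            environment={"PROTOCOL": protocol, "WORKER_ID": str(i)},
            detach=True,
        )
        containers.append(c)
    return containers
```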
3. THE PROUD PROTOCOL DESIGN

3.1. Problem Formulation
PROUD securely computes the DNN model on encrypted inputs in the cloud. We first denote an ℓ-layer DNN model as M = {L_i, i ∈ [1, ℓ]}, and the input video as V. The inference model M can be viewed as a complex function f(·) composing linear functions (corresponding to linear layers, e.g., convolution layers and fully-connected layers) and non-linear functions (activation functions, e.g., Sigmoid and ReLU). Denoting the inference result as S, we have:

S = f(V) = L_ℓ(L_{ℓ-1}(··· L_2(L_1(V)) ···))    (1)

Threat Model. We consider the semi-honest model, where both parties honestly execute the protocol but are curious to learn private information. PROUD preserves privacy for both parties against possible leakage: (1) the client's private input videos are not leaked to the cloud service provider; (2) the cloud service provider's DNN model (e.g., linear/non-linear weight parameters and bias values) is not revealed to the client during the computation. We also assume that all communications are executed over a secure and authenticated channel.
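For intuition, Equation (1) is simply function composition over the layer list; a minimal plaintext Python sketch follows (the layer functions below are arbitrary stand-ins, not an actual DNN):

```python
from functools import reduce

# Toy stand-ins for the layers L_1, ..., L_ell (plaintext, for intuition only).
layers = [lambda x: 2 * x,      # a "linear" layer
          lambda x: max(x, 0),  # a "non-linear" layer (ReLU)
          lambda x: x + 1]      # another "linear" layer

def f(V):
    # S = L_ell(L_{ell-1}(... L_2(L_1(V)) ...))  -- Equation (1)
    return reduce(lambda acc, layer: layer(acc), layers, V)

S = f(-3)  # 2*(-3) = -6 -> max(-6, 0) = 0 -> 0 + 1 = 1
```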
3.2. Protocol Overview

Algorithm 1 illustrates the overall PROUD protocol. In the initialization phase, the client generates a key pair and encrypts the private data V (Line 1), while the server prepares the computation of the DNN functions (Equation 1) with two subprotocols: (1) MatF for the linear functions; (2) NlnF for the non-linear activation functions (Line 2). With these two subprotocols, PROUD is jointly executed by the client and the server. Specifically, the server performs the computation of the linear layers directly on the encrypted data received from the client using the subprotocol MatF (Line 5). For the non-linear layers, the output data are sent back to the client for computation by the subprotocol NlnF (Line 6); the client then re-encodes and encrypts the data and sends them to the server for the next layer's computation. Once the computations of all the layers in the DNN model are completed, the client receives the ciphertext and decrypts it to obtain the classification result. The two subprotocols are detailed in Sections 3.3 and 3.4, respectively.

Algorithm 1: PROUD Protocol
  Input: Input data V, DNN model M
  Output: Classification result S
  1: Client: encode and encrypt V to get τ_0
  2: Server: (MatF, NlnF) ← M
  3: for i ∈ [1, ℓ] do
  4:   switch L_i do
  5:     case Linear: τ_i ← MatF(τ_{i-1})
  6:     case Non-Linear: τ_i ← NlnF(τ_{i-1})
  7: Client: decrypt τ_ℓ to get S

3.3. MatF: Secure Computation for Linear Layers

To ensure privacy for the linear layers, a naive method is to apply homomorphic encryption (HE) to the arithmetic operations on encrypted matrices (e.g., in fully-connected layers), which can be inefficient since the input data tensors are usually high-dimensional. To mitigate this issue, PROUD utilizes a novel matrix permutation method [3] to efficiently perform matrix computations with ciphertext packing and parallelization [11], where a matrix multiplication equals the sum of the component-wise products of specific permutations of the matrices themselves.

Given the input matrix V, the linear layer (weight) matrix W, and the bias parameter B, PROUD securely computes the function of a linear layer as W ∗ V + B (w.l.o.g., we consider a fully-connected layer with bias, where W and V are square matrices of size n × n). Consider a square matrix A of size n × n. To compute the multiplication, the server first derives n permutations of the matrix via the following permutations:

σ(A)_{i,j} = A_{i,i+j},   τ(A)_{i,j} = A_{i+j,j}    (2)
φ(A)_{i,j} = A_{i,j+1},   ψ(A)_{i,j} = A_{i+1,j}    (3)

where the indices are taken modulo n, and φ, ψ are the column and row shifting operations, respectively. Then, the product of W and V can be computed as:

W ∗ V = Σ_{k=0}^{n-1} W_k ⊙ V_k    (4)

where W_k = φ^k(σ(W)), V_k = ψ^k(τ(V)), ⊙ denotes the component-wise product, and φ^k (resp. ψ^k) applies the permutation φ(·) (resp. ψ(·)) k times to the matrix. We denote by permut(·) the function that computes the n permutation matrices of a given matrix, as verified in the sketch below.
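To make the permutation identity of Equation (4) concrete, the following plaintext NumPy sketch checks it against ordinary matrix multiplication (plaintext only, for exposition; in PROUD the component-wise products and sums are evaluated under HE):

```python
import numpy as np

def sigma(A):   # σ(A)_{i,j} = A_{i, i+j mod n}
    n = A.shape[0]
    return np.array([[A[i, (i + j) % n] for j in range(n)] for i in range(n)])

def tau(A):     # τ(A)_{i,j} = A_{i+j mod n, j}
    n = A.shape[0]
    return np.array([[A[(i + j) % n, j] for j in range(n)] for i in range(n)])

def phi(A, k):  # φ^k(A)_{i,j} = A_{i, j+k mod n}  (column shift, applied k times)
    return np.roll(A, -k, axis=1)

def psi(A, k):  # ψ^k(A)_{i,j} = A_{i+k mod n, j}  (row shift, applied k times)
    return np.roll(A, -k, axis=0)

def matf_plaintext(W, V, B):
    # Equation (4): W * V = sum_k φ^k(σ(W)) ⊙ ψ^k(τ(V)), then add the bias B.
    n = W.shape[0]
    return sum(phi(sigma(W), k) * psi(tau(V), k) for k in range(n)) + B

n = 4
W, V, B = (np.random.randint(0, 10, (n, n)) for _ in range(3))
assert np.array_equal(matf_plaintext(W, V, B), W @ V + B)
```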
Ciphertext Packing and Parallelization. To improve the efficiency, we also leverage vectorized homomorphic encryption (a.k.a. "ciphertext packing") [3, 5], which transforms a matrix of size d × d into a single plaintext vector via an encoding map function, denoted as Encode; the Decode function transforms the plaintext vector back to matrix form. For simplicity of notation, we denote the encryption, evaluation, and decryption functions of an HE scheme as Enc(), Eval(), and Dec(), respectively.

Then, the component-wise product (Equation 4) of the ciphertexts V_k and W_k, denoted as Enc(pk, O_k), can be securely computed with the multiplicative property of HE:

Enc(pk, O_k) = Eval(pk, Encode(W_k^{(l,m)}), Enc(pk, Encode(V_k^{(l,m)})), ∗)    (5)

where l, m ∈ [1, n] are the entry indices of the matrices W and V, and pk is the public key. The sum of all n component-wise products of the matrices W_k and V_k can then be computed using the additive property of HE, and finally the bias parameter B is added, again using the additive property of HE. The protocol is detailed in Algorithm 2.

Given the large number of plaintexts to be encrypted by ciphertext packing, we further expedite the matrix computation with parallelization [3]. To this end, we modify the encoding map function to a "1-to-1 map" such that an n-dimensional vector can be transformed into a g-tuple of square matrices of order d, where g = n/d. This parallelization technique can also be realized with parallel computation in the cloud framework (using a set of containers), which reduces the computational complexity to O(d/g) per matrix.
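A toy illustration of the packing idea, assuming a simple row-major encoding (real HEAAN encodings differ in detail): once a d × d matrix is packed into a single plaintext vector, one slot-wise ciphertext multiplication realizes all d² component-wise products of Equation (5) at once.

```python
import numpy as np

d = 4

def encode(A):  # pack a d x d matrix into one plaintext vector (row-major)
    return A.reshape(-1)

def decode(v):  # unpack the plaintext vector back into matrix form
    return v.reshape(d, d)

Wk = np.random.randn(d, d)
Vk = np.random.randn(d, d)

# One slot-wise product over the packed vectors equals the full
# component-wise matrix product W_k ⊙ V_k; under HE, this would be a
# single Eval(pk, ..., *) call on the packed ciphertexts.
packed = encode(Wk) * encode(Vk)
assert np.allclose(decode(packed), Wk * Vk)
```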
Algorithm 2: MatF
  Input: Input V, weight matrix W, bias B
  Output: O = Enc(pk, W ∗ V + B)
  1: {V_k}_{k=0}^{n-1} ← Enc(pk, Encode(permut(V)))
  2: {W_k}_{k=0}^{n-1} ← Encode(permut(W))
  3: for k ∈ [0, n-1] do
  4:   O_k ← Eval(pk, W_k^{(l,m)}, V_k^{(l,m)}, ∗)
  5: Enc(pk, O) ← Eval(pk, {O_k, k ∈ [0, n-1]}, +)
  6: return Eval(pk, Enc(pk, O), B, +)

3.4. NlnF: Secure Computation for Non-Linear Layers

The NlnF protocol securely computes the non-linear layers of DNNs. Most existing works rely on either garbled circuits [5] or replacing the activation with a square function [3], which may incur high computational overheads or reduce the accuracy. In our protocol, the computation of the non-linear function (e.g., ReLU) is executed at the client side on the decrypted data to preserve privacy. As Algorithm 3 shows, the client first decrypts the received output of MatF from the server with its private key. The client then computes the output of the non-linear function φ and returns the output to the server for the computation of the next network layer. During the execution of this protocol, the client does not leak any private information to the server, and the server does not expose its sensitive weight parameters to the client.
Algorithm 3: NlnF
  Input: Input V (from MatF), activation function φ(·)
  Output: O = φ(Decode(Dec(sk, V)))
  1: Server: sends V to the client
  2: Client: r ← Decode(Dec(sk, V))
  3: return O ← φ(r)

Security and Practicality. For the linear computations (MatF), the server learns nothing about the plaintext since all the computations are performed on ciphertexts ("no leakage" can be theoretically proven). For the non-linear computations, the client receives some encrypted intermediate results from the server and decrypts them to obtain trivial intermediate data (which do not result in privacy leakage). This release of trivial, non-private data is traded for a light-weight cryptographic protocol, which is far more efficient than cryptographic protocols built on secure polynomial approximation and/or garbled circuits. Since the protocol is composed modularly, many neural network based applications (e.g., image classification [1] and natural language processing [12]) and video learning models (e.g., C3D [2] and I3D [13]) can be readily integrated into our system. Pre-trained DNNs can be adapted with appropriate extensions and integrated into the PROUD protocol (for feature extraction and/or inference on encrypted data). Moreover, the PROUD system can be easily deployed on practical cloud platforms (e.g., AWS), since PROUD is a cloud-based system prototype.
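The round trip of Algorithm 3 can be sketched as follows with stub primitives; Enc/Dec/Encode/Decode below are hypothetical toy placeholders (not a real HE scheme), and a real deployment would use a CKKS-style library such as HEAAN:

```python
import numpy as np

d = 4

# Hypothetical stand-ins for the HE primitives (toy wrappers, no security).
def Enc(pk, v):  return ("ct", np.array(v))
def Dec(sk, ct): return ct[1]
def Encode(A):   return A.reshape(-1)
def Decode(v):   return v.reshape(d, d)

phi = lambda r: np.maximum(r, 0.0)  # the activation function, e.g., ReLU

# Server: sends the encrypted MatF output V to the client.
ct = Enc("pk", Encode(np.random.randn(d, d)))

# Client (Algorithm 3): decrypt and decode, apply φ, then re-encode and
# re-encrypt so the server can compute the next layer on ciphertexts.
r = Decode(Dec("sk", ct))
ct_next = Enc("pk", Encode(phi(r)))
```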
4. EXPERIMENTS

Experimental Setup. Our system is implemented on the NSF CloudLab platform, in which one machine works as the client and the other as the server. Both machines have eight 64-bit ARMv8 cores at 2.4 GHz and 64 GB of memory, and run Ubuntu 16.04. We implement the homomorphic encryption with HEAAN [11] (which supports efficient approximate computation over real numbers) for the secure matrix operations. We leverage Docker to develop the PROUD prototype: the container image (with all the source code) is pre-compiled with the specific functions (i.e., MatF and NlnF) in Python.

We evaluate our framework on two datasets: (1) the MNIST dataset [14], which includes 70K handwritten digit images of size 28 × 28 with gray levels in 0-255; (2) the IDC dataset for invasive ductal carcinoma (IDC) classification (IDC-negative or IDC-positive), which contains about 28K patches of 50 × 50 pixels. We employ LeNet5 [14] as the test network model. In addition, we compare the performance of PROUD with four representative schemes (CryptoNets [6], GAZELLE [5], BAYHENN [7], and DELPHI [15]) on the MNIST and IDC datasets for image classification.

Results. The results on the two datasets are shown in Tables 1 and 2, respectively. From Table 1, we observe that PROUD yields the lowest average latency (e.g., 13 times faster than GAZELLE) and the lowest communication overheads for digit classification, compared with the four existing schemes.

Table 1. Benchmarking on the MNIST dataset
Framework    Accuracy (%)   Latency (s)   Comm. (MB)
CryptoNets   81.25          1942.6        1621.3
GAZELLE      83.74          24.64         263.4
BAYHENN      83.26          9.36          67.32
DELPHI       82.72          2.47          2.95
PROUD        84.01          1.12          3.27
PROUD significantly outperforms the other schemes since we adopt a highly light-weight matrix computation scheme, compared with the existing approaches (which rely on garbled circuits or heavyweight matrix encryption). As for the classification accuracy, PROUD works almost identically to GAZELLE (in which the optimal approximation of the non-linear function achieves negligible loss relative to the original activation function). It is worth noting that CryptoNets performs the worst, since it replaces all the activation functions with square functions and all the pooling functions with sum pooling, which also greatly increases the computational overhead and the communication bandwidth (due to the larger ciphertext size). BAYHENN uses a Bayesian inference model with some randomness for the DNN, which decreases the classification accuracy to some extent. Also, DELPHI's computation overheads lie mainly in the offline preparation (heavy cryptographic computations), which reduces its online computation overhead. Table 2 shows similar results for IDC classification. All of these results illustrate that our proposed framework can significantly improve the computational efficiency of secure DNN inference compared with other state-of-the-art techniques.

Table 2. Benchmarking on the IDC dataset

We also report the latency and communication bandwidth of each step of PROUD processing one image instance in Table 3. Specifically, the client takes about 23.4 ms, including the runtime for encoding and encrypting the image. The server then initializes the DNN model in 107.2 ms (note that this step can be processed simultaneously with the first step at the client). Moreover, we observe that the DNN computation at the server dominates the latency. Regarding the communication overheads, they mainly occur when the client sends the encrypted image to the server (0.58 MB); additional communication is incurred during the DNN inference, since the NlnF protocol follows an interactive paradigm.

Table 3. Performance of PROUD on the MNIST dataset
Phase                     Latency (ms)   Comm. (MB)
Client Encode + Encrypt   23.4           0.58
Server Set Model          107.2          -
Server DNN Computation    410.8          0.34
Client Decrypt + Decode   2.7            0.03
Total                     544.1          0.95
5. RELATED WORK
The generic secure computation techniques (e.g., secure two-party computation [9, 16], fully homomorphic encryption [17], and secret sharing [18]) can be directly used to tackle the privacy concerns in DNN inference. However, such cryptographic primitives incur high computation and communication overheads. For instance, the size of the garbled circuits in MPC protocols grows exponentially as the number of parties increases, and such protocols also require multiple rounds of communication among the parties. Recently, although multiple works have improved the efficiency of FHE [19, 20, 21], the high computational costs still make it impractical for performing inference.

Therefore, it seems to be necessary to design specific protocols for secure learning, and there have been several works on designing secure protocols tailored to DNN models [6, 22, 5, 7]. SecureML [22] is one of the first systems that focuses on machine learning on encrypted data with neural network models; however, it mainly depends on generic two-party protocols with very poor performance. Jiang et al. [3] proposed an efficient secure matrix computation protocol to improve the performance of computation with neural networks; however, it only supports limited activation functions (e.g., only the square function in the case study). GAZELLE [5] builds its protocol on heavy garbled circuits, while BAYHENN [7] leverages inefficient ciphertext packing of matrices for the linear computations. Although DELPHI improves the bandwidth of the online protocol, it still depends on offline computation and neural architecture search (NAS).
6. CONCLUSION
We have proposed a novel privacy-preserving DNN inference framework with cloud container technology, which ensures both privacy protection and high efficiency under complex neural network settings. It employs efficient homomorphic matrix operations to securely execute inference interactively. Furthermore, we have designed and implemented a prototype of PROUD. Finally, we conducted experiments to evaluate the performance on two common datasets. PROUD has been shown to outperform the existing schemes, and it can be readily integrated into practical cloud infrastructure.
Acknowledgments
This work is partially supported by the NSF under Grant No. CNS-1745894. The authors would like to thank the anonymous reviewers for their constructive comments.

7. REFERENCES

[1] Weibo Liu, Zidong Wang, Xiaohui Liu, Nianyin Zeng, Yurong Liu, and Fuad E. Alsaadi, "A survey of deep neural network architectures and their applications," Neurocomputing, vol. 234, pp. 11-26, 2017.

[2] Du Tran, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri, "Learning spatiotemporal features with 3D convolutional networks," in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 4489-4497.

[3] Xiaoqian Jiang, Miran Kim, Kristin Lauter, and Yongsoo Song, "Secure outsourced matrix computation and application to neural networks," in Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2018, pp. 1209-1222.

[4] Florian Tramèr, Fan Zhang, Ari Juels, Michael K. Reiter, and Thomas Ristenpart, "Stealing machine learning models via prediction APIs," in USENIX Security Symposium, 2016, pp. 601-618.

[5] Chiraag Juvekar, Vinod Vaikuntanathan, and Anantha Chandrakasan, "GAZELLE: A low latency framework for secure neural network inference," in USENIX Security Symposium, 2018, pp. 1651-1669.

[6] Ran Gilad-Bachrach, Nathan Dowlin, Kim Laine, Kristin Lauter, Michael Naehrig, and John Wernsing, "CryptoNets: Applying neural networks to encrypted data with high throughput and accuracy," in International Conference on Machine Learning, 2016, pp. 201-210.

[7] Peichen Xie, Bingzhe Wu, and Guangyu Sun, "BAYHENN: Combining Bayesian deep learning and homomorphic encryption for secure DNN inference," arXiv preprint arXiv:1906.00639, 2019.

[8] Qiao Zhang, Cong Wang, Hongyi Wu, Chunsheng Xin, and Tran V. Phuong, "GELU-Net: A globally encrypted, locally unencrypted deep neural network for privacy-preserved learning."

[9] A. C. Yao, "How to generate and exchange secrets," in Annual Symposium on Foundations of Computer Science (FOCS), Oct 1986, pp. 162-167.

[10] Pascal Paillier, "Public-key cryptosystems based on composite degree residuosity classes," in International Conference on the Theory and Applications of Cryptographic Techniques. Springer, 1999, pp. 223-238.

[11] Jung Hee Cheon, Andrey Kim, Miran Kim, and Yongsoo Song, "Homomorphic encryption for arithmetic of approximate numbers," in International Conference on the Theory and Application of Cryptology and Information Security. Springer, 2017, pp. 409-437.

[12] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," arXiv preprint arXiv:1810.04805, 2018.

[13] Joao Carreira and Andrew Zisserman, "Quo vadis, action recognition? A new model and the Kinetics dataset," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 6299-6308.

[14] Yann LeCun, Léon Bottou, Yoshua Bengio, Patrick Haffner, et al., "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, 1998.

[15] Pratyush Mishra, Ryan Lehmkuhl, Akshayaram Srinivasan, Wenting Zheng, and Raluca Ada Popa, "Delphi: A cryptographic inference service for neural networks," in USENIX Security Symposium, 2020.

[16] O. Goldreich, S. Micali, and A. Wigderson, "How to play any mental game," in Proceedings of the Nineteenth Annual ACM Symposium on Theory of Computing, 1987, pp. 218-229.

[17] Craig Gentry, "Fully homomorphic encryption using ideal lattices," in Proceedings of the Forty-first Annual ACM Symposium on Theory of Computing (STOC '09), New York, NY, USA, 2009, pp. 169-178, ACM.

[18] Elette Boyle, Geoffroy Couteau, Niv Gilboa, Yuval Ishai, and Michele Orrù, "Homomorphic secret sharing: Optimizations and applications," in Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, 2017, pp. 2105-2122.

[19] Junfeng Fan and Frederik Vercauteren, "Somewhat practical fully homomorphic encryption," Cryptology ePrint Archive, Report 2012/144, 2012.

[20] Shai Halevi and Victor Shoup, "Faster homomorphic linear transformations in HElib," in Annual International Cryptology Conference. Springer, 2018, pp. 93-120.

[21] Shai Halevi, Yuriy Polyakov, and Victor Shoup, "An improved RNS variant of the BFV homomorphic encryption scheme," in Cryptographers' Track at the RSA Conference. Springer, 2019, pp. 83-105.

[22] Payman Mohassel and Yupeng Zhang, "SecureML: A system for scalable privacy-preserving machine learning," in 2017 IEEE Symposium on Security and Privacy (SP), 2017.