Publications


Featured research published by Nicolas Papernot.


IEEE European Symposium on Security and Privacy | 2016

The Limitations of Deep Learning in Adversarial Settings

Nicolas Papernot; Patrick D. McDaniel; Somesh Jha; Matthew Fredrikson; Z. Berkay Celik; Ananthram Swami

Deep learning takes advantage of large datasets and computationally efficient training algorithms to outperform other approaches at various machine learning tasks. However, imperfections in the training phase of deep neural networks make them vulnerable to adversarial samples: inputs crafted by adversaries with the intent of causing deep neural networks to misclassify. In this work, we formalize the space of adversaries against deep neural networks (DNNs) and introduce a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs. In an application to computer vision, we show that our algorithms can reliably produce samples correctly classified by human subjects but misclassified as specific targets by a DNN with a 97% adversarial success rate while only modifying on average 4.02% of the input features per sample. We then evaluate the vulnerability of different sample classes to adversarial perturbations by defining a hardness measure. Finally, we describe preliminary work outlining defenses against adversarial samples by defining a predictive measure of distance between a benign input and a target classification.
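
The sketch below is a heavily simplified, greedy variant in the spirit of the Jacobian-guided crafting described above, not the paper's exact algorithm: it repeatedly increases the single input feature whose gradient most raises the target-class logit. The PyTorch model, the [0, 1] feature range, and all parameter names are assumptions made for illustration.

```python
# Hypothetical sketch: greedy, Jacobian-guided targeted perturbation.
# "model" is any differentiable classifier returning logits for an input of
# shape (1, num_features); feature values are assumed to lie in [0, 1].
import torch

def saliency_style_attack(model, x, target, eps=1.0, max_features=40):
    x = x.clone().detach()
    for _ in range(max_features):
        x_adv = x.clone().requires_grad_(True)
        logits = model(x_adv)
        if logits.argmax(dim=1).item() == target:
            break                            # classified as the adversary's target
        # One row of the input-output Jacobian: d(target logit) / d(input)
        grad = torch.autograd.grad(logits[0, target], x_adv)[0]
        # Perturb the single most influential feature, staying inside [0, 1]
        idx = grad.flatten().argmax()
        x = x.flatten()
        x[idx] = torch.clamp(x[idx] + eps, 0.0, 1.0)
        x = x.reshape(grad.shape)
    return x
```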


IEEE Symposium on Security and Privacy | 2016

Distillation as a Defense to Adversarial Perturbations Against Deep Neural Networks

Nicolas Papernot; Patrick D. McDaniel; Xi Wu; Somesh Jha; Ananthram Swami

Deep learning algorithms have been shown to perform extremely well on many classical machine learning problems. However, recent studies have shown that deep learning, like other machine learning techniques, is vulnerable to adversarial samples: inputs crafted to force a deep neural network (DNN) to provide adversary-selected outputs. Such attacks can seriously undermine the security of the system supported by the DNN, sometimes with devastating consequences. For example, autonomous vehicles can be crashed, illicit or illegal content can bypass content filters, or biometric authentication systems can be manipulated to allow improper access. In this work, we introduce a defensive mechanism called defensive distillation to reduce the effectiveness of adversarial samples on DNNs. We analytically investigate the generalizability and robustness properties granted by the use of defensive distillation when training DNNs. We also empirically study the effectiveness of our defense mechanisms on two DNNs placed in adversarial settings. The study shows that defensive distillation can reduce the effectiveness of adversarial sample creation from 95% to less than 0.5% on a studied DNN. Such dramatic gains can be explained by the fact that distillation leads gradients used in adversarial sample creation to be reduced by a factor of 10^30. We also find that distillation increases the average minimum number of features that need to be modified to create adversarial samples by about 800% on one of the DNNs we tested.
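
As a rough illustration of the training recipe described above, and not the authors' exact setup, the sketch below distills a trained teacher into a student of the same architecture at a high softmax temperature T; the networks, data loader, and hyperparameters are placeholders.

```python
# Hypothetical sketch of a defensive-distillation training loop.
# "teacher" is a trained classifier, "student" an identically shaped network,
# and "loader" yields (input, label) batches; T is the distillation temperature.
import torch
import torch.nn.functional as F

def distill(teacher, student, loader, T=20.0, epochs=10, lr=1e-3):
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    teacher.eval()
    for _ in range(epochs):
        for x, _ in loader:
            with torch.no_grad():
                # Soft labels produced by the teacher at high temperature T
                soft = F.softmax(teacher(x) / T, dim=1)
            # Train the student against those soft labels at the same temperature
            log_p = F.log_softmax(student(x) / T, dim=1)
            loss = -(soft * log_p).sum(dim=1).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
    # The student is deployed at T = 1, which is what shrinks the input
    # gradients that adversarial sample crafting relies on.
    return student
```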


Computer and Communications Security | 2017

Practical Black-Box Attacks against Machine Learning

Nicolas Papernot; Patrick D. McDaniel; Ian J. Goodfellow; Somesh Jha; Z. Berkay Celik; Ananthram Swami

Machine learning (ML) models, e.g., deep neural networks (DNNs), are vulnerable to adversarial examples: malicious inputs modified to yield erroneous model outputs, while appearing unmodified to human observers. Potential attacks include having malicious content like malware identified as legitimate or controlling vehicle behavior. Yet, all existing adversarial example attacks require knowledge of either the model internals or its training data. We introduce the first practical demonstration of an attacker controlling a remotely hosted DNN with no such knowledge. Indeed, the only capability of our black-box adversary is to observe labels given by the DNN to chosen inputs. Our attack strategy consists of training a local model to substitute for the target DNN, using inputs synthetically generated by an adversary and labeled by the target DNN. We use the local substitute to craft adversarial examples, and find that they are misclassified by the targeted DNN. To perform a real-world and properly blinded evaluation, we attack a DNN hosted by MetaMind, an online deep learning API. We find that their DNN misclassifies 84.24% of the adversarial examples crafted with our substitute. We demonstrate the general applicability of our strategy to many ML techniques by conducting the same attack against models hosted by Amazon and Google, using logistic regression substitutes. They yield adversarial examples misclassified by Amazon and Google at rates of 96.19% and 88.94%. We also find that this black-box attack strategy is capable of evading defense strategies previously found to make adversarial example crafting harder.
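
Below is a condensed sketch of the substitute-training idea, under the assumption that the only access to the remote model is a label oracle; the Jacobian-based dataset augmentation and the fast gradient step are simplified, and oracle_labels, substitute, and the hyperparameters are illustrative stand-ins rather than the paper's implementation.

```python
# Hypothetical sketch: train a local substitute from oracle labels, then craft
# adversarial examples on the substitute and transfer them to the remote model.
import torch
import torch.nn.functional as F

def train_substitute(oracle_labels, substitute, x_init, rounds=4, lmbda=0.1):
    x = x_init.clone()
    for _ in range(rounds):
        y = oracle_labels(x)                 # query the remote DNN for class labels
        opt = torch.optim.Adam(substitute.parameters(), lr=1e-3)
        for _ in range(100):                 # fit the substitute to the oracle's labels
            loss = F.cross_entropy(substitute(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
        # Jacobian-based augmentation: step each point along the sign of the
        # gradient of its assigned class to better explore the decision boundary
        x_aug = x.clone().requires_grad_(True)
        score = substitute(x_aug).gather(1, y.unsqueeze(1)).sum()
        grad = torch.autograd.grad(score, x_aug)[0]
        x = torch.cat([x, (x + lmbda * grad.sign()).clamp(0, 1)])
    return substitute

def craft_on_substitute(substitute, x, y, eps=0.1):
    x_adv = x.clone().requires_grad_(True)
    loss = F.cross_entropy(substitute(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    return (x + eps * grad.sign()).clamp(0, 1)   # then submitted to the remote DNN
```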


Military Communications Conference | 2016

Crafting adversarial input sequences for recurrent neural networks

Nicolas Papernot; Patrick D. McDaniel; Ananthram Swami; Richard E. Harang

Machine learning models are frequently used to solve complex security problems, as well as to make decisions in sensitive situations like guiding autonomous vehicles or predicting financial market behaviors. Previous efforts have shown that numerous machine learning models are vulnerable to adversarial manipulations of their inputs taking the form of adversarial samples. Such inputs are crafted by adding carefully selected perturbations to legitimate inputs so as to force the machine learning model to misbehave, for instance by outputting a wrong class if the machine learning task of interest is classification. In fact, to the best of our knowledge, all previous work on adversarial sample crafting for neural networks considered models used to solve classification tasks, most frequently in computer vision applications. In this paper, we investigate adversarial input sequences for recurrent neural networks processing sequential data. We show that the classes of algorithms introduced previously to craft adversarial samples misclassified by feed-forward neural networks can be adapted to recurrent neural networks. In an experiment, we show that adversaries can craft adversarial sequences misleading both categorical and sequential recurrent neural networks.
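
The sketch below illustrates, under simplifying assumptions, how a fast gradient-style perturbation carries over to a recurrent model: backpropagation through the unrolled recurrence yields a gradient for every time step of the embedded input sequence. The LSTM classifier, dimensions, and label are invented for the example.

```python
# Hypothetical sketch: perturbing an embedded input sequence fed to an RNN.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SeqClassifier(nn.Module):
    def __init__(self, dim=32, hidden=64, classes=2):
        super().__init__()
        self.rnn = nn.LSTM(dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, classes)

    def forward(self, seq):                  # seq: (batch, time, dim) embeddings
        _, (h, _) = self.rnn(seq)
        return self.out(h[-1])

def adversarial_sequence(model, seq, label, eps=0.1):
    seq_adv = seq.clone().requires_grad_(True)
    loss = F.cross_entropy(model(seq_adv), label)
    # Backpropagation through time gives a gradient for every time step
    grad = torch.autograd.grad(loss, seq_adv)[0]
    return seq + eps * grad.sign()           # perturb each step of the sequence

model = SeqClassifier()
seq = torch.randn(1, 10, 32)                 # a 10-step embedded input sequence
adv = adversarial_sequence(model, seq, torch.tensor([1]))
```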


European Symposium on Research in Computer Security | 2017

Adversarial Examples for Malware Detection

Kathrin Grosse; Nicolas Papernot; Praveen Manoharan; Michael Backes; Patrick D. McDaniel

Machine learning models are known to lack robustness against inputs crafted by an adversary. Such adversarial examples can, for instance, be derived from regular inputs by introducing minor—yet carefully selected—perturbations.
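
As a hypothetical illustration of domain-constrained crafting in the malware setting, the sketch below perturbs a binary feature vector under the added assumption that features may only be switched on, never removed, so the change is less likely to break the sample; the classifier, the "benign" class index, and the change budget are placeholders rather than the paper's procedure.

```python
# Hypothetical sketch: greedy, add-only perturbation of a binary feature vector.
import torch

def add_only_perturbation(model, x, benign_class=0, max_changes=20):
    x = x.clone().detach()                   # x: (1, num_features) indicator vector
    for _ in range(max_changes):
        x_adv = x.clone().requires_grad_(True)
        logits = model(x_adv)
        if logits.argmax(dim=1).item() == benign_class:
            break                            # now classified as benign
        grad = torch.autograd.grad(logits[0, benign_class], x_adv)[0]
        # Only consider features that are currently absent (0 -> 1 flips)
        candidates = grad.flatten() * (x.flatten() == 0)
        x = x.flatten()
        x[candidates.argmax()] = 1.0
        x = x.reshape(grad.shape)
    return x
```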


IEEE Symposium on Security and Privacy | 2016

Machine Learning in Adversarial Settings

Patrick D. McDaniel; Nicolas Papernot; Z. Berkay Celik

Recent advances in machine learning have led to innovative applications and services that use computational structures to reason about complex phenomena. Over the past several years, the security and machine-learning communities have developed novel techniques for constructing adversarial samples--malicious inputs crafted to mislead (and therefore corrupt the integrity of) systems built on computationally learned models. The authors consider the underlying causes of adversarial samples and the future countermeasures that might mitigate them.


Proceedings of the First ACM Workshop on Moving Target Defense | 2014

Security and Science of Agility

Patrick D. McDaniel; Trent Jaeger; Thomas F. La Porta; Nicolas Papernot; Robert J. Walls; Alexander Kott; Lisa M. Marvel; Ananthram Swami; Prasant Mohapatra; Srikanth V. Krishnamurthy; Iulian Neamtiu

Moving target defenses alter the environment in response to adversarial action and perceived threats. Such defenses are a specific example of a broader class of system management techniques called system agility. In its fullest generality, agility is any reasoned modification to a system or environment in response to a functional, performance, or security need. This paper details a recently launched 10-year Cyber-Security Collaborative Research Alliance effort focused in part on the development of a new science of system agility, of which moving target defenses are a central theme. In this context, the consortium seeks to address the questions of when, what, and how to employ changes to improve the security of an environment, as well as consider how to measure and weigh the effectiveness of different approaches to agility. We discuss several fundamental challenges in developing and using MTD maneuvers, and outline several broad classes of mechanisms that can be used to implement them. We conclude by detailing specific MTD mechanisms used to adaptively quarantine vulnerable code in Android applications, and consider ways of comparing cost and payout of its use.


IEEE Computer Security Foundations Symposium | 2017

On the Protection of Private Information in Machine Learning Systems: Two Recent Approaches

Martín Abadi; Úlfar Erlingsson; Ian J. Goodfellow; H. Brendan McMahan; Ilya Mironov; Nicolas Papernot; Kunal Talwar; Li Zhang

The recent, remarkable growth of machine learning has led to intense interest in the privacy of the data on which machine learning relies, and to new techniques for preserving privacy. However, older ideas about privacy may well remain valid and useful. This note reviews two recent works on privacy in the light of the wisdom of some of the early literature, in particular the principles distilled by Saltzer and Schroeder in the 1970s.


Computer and Communications Security | 2018

Detection under Privileged Information

Z. Berkay Celik; Patrick D. McDaniel; Rauf Izmailov; Nicolas Papernot; Ryan Sheatsley; Raquel Alvarez; Ananthram Swami

For well over a quarter century, detection systems have been driven by models learned from input features collected from real or simulated environments. An artifact (e.g., network event, potential malware sample, suspicious email) is deemed malicious or non-malicious based on its similarity to the learned model at runtime. However, the training of the models has been historically limited to only those features available at runtime. In this paper, we consider an alternate learning approach that trains models using privileged information--features available at training time but not at runtime--to improve the accuracy and resilience of detection systems. In particular, we adapt and extend recent advances in knowledge transfer, model influence, and distillation to enable the use of forensic or other data unavailable at runtime in a range of security domains. An empirical evaluation shows that privileged information increases precision and recall over a system with no privileged information: we observe up to 7.7% relative decrease in detection error for fast-flux bot detection, 8.6% for malware traffic detection, 7.3% for malware classification, and 16.9% for face recognition. We explore the limitations and applications of different privileged information techniques in detection systems. Such techniques provide a new means for detection systems to learn from data that would otherwise not be available at runtime.
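
One way to realize the knowledge-transfer idea described above is a distillation-style objective in which a teacher sees both runtime and privileged features while the student sees only runtime features. The sketch below is an assumption-laden illustration of that pattern, not the paper's exact formulation; the networks, the data loader, and the mixing weight are placeholders.

```python
# Hypothetical sketch: distilling privileged (training-time-only) features into
# a detector that will only see runtime features in deployment.
import torch
import torch.nn.functional as F

def train_with_privileged(teacher, student, loader, T=5.0, alpha=0.5, lr=1e-3):
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    teacher.eval()
    for x_runtime, x_priv, y in loader:
        with torch.no_grad():
            # Teacher sees runtime + privileged features (e.g., forensic data)
            soft = F.softmax(teacher(torch.cat([x_runtime, x_priv], dim=1)) / T, dim=1)
        logits = student(x_runtime)          # student sees runtime features only
        hard_loss = F.cross_entropy(logits, y)
        soft_loss = -(soft * F.log_softmax(logits / T, dim=1)).sum(dim=1).mean()
        loss = alpha * hard_loss + (1 - alpha) * soft_loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return student
```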


Communications of the ACM | 2018

Making machine learning robust against adversarial inputs

Ian J. Goodfellow; Patrick D. McDaniel; Nicolas Papernot

Such inputs distort how machine-learning-based systems are able to function in the world as it is.

Collaboration


Dive into Nicolas Papernot's collaborations.

Top Co-Authors

Patrick D. McDaniel (Pennsylvania State University)
Z. Berkay Celik (Pennsylvania State University)
Robert J. Walls (Pennsylvania State University)