Rachel Greenstadt
Drexel University
Publications
Featured research published by Rachel Greenstadt.
ieee symposium on security and privacy | 2012
Sadia Afroz; Michael Brennan; Rachel Greenstadt
In digital forensics, questions often arise about the authors of documents: their identity, demographic background, and whether they can be linked to other documents. The field of stylometry uses linguistic features and machine learning techniques to answer these questions. While stylometry techniques can identify authors with high accuracy in non-adversarial scenarios, their accuracy is reduced to random guessing when faced with authors who intentionally obfuscate their writing style or attempt to imitate that of another author. While these results are good for privacy, they raise concerns about fraud. We argue that some linguistic features change when people hide their writing style and by identifying those features, stylistic deception can be recognized. The major contribution of this work is a method for detecting stylistic deception in written documents. We show that using a large feature set, it is possible to distinguish regular documents from deceptive documents with 96.6% accuracy (F-measure). We also present an analysis of linguistic features that can be modified to hide writing style.
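As a minimal illustration of the kind of stylometric features involved, here is a toy feature extractor in Python. The function-word list and the two feature types are placeholders for illustration only, not the paper's large feature set:

```python
from collections import Counter

# Hypothetical mini feature set; the paper's actual "large feature set"
# contains many more lexical and syntactic features.
FUNCTION_WORDS = ["the", "of", "and", "to", "in", "that", "is", "was"]

def style_features(text):
    """Extract a small stylometric feature vector from a document."""
    words = text.lower().split()
    counts = Counter(words)
    n = max(len(words), 1)
    # Relative frequency of each function word.
    vec = [counts[w] / n for w in FUNCTION_WORDS]
    # Average word length, a coarse lexical feature.
    vec.append(sum(len(w) for w in words) / n)
    return vec

doc = "The cat sat on the mat and the dog was in the yard."
fv = style_features(doc)
print(len(fv))  # one entry per function word, plus average word length
```

A classifier trained on vectors like these over regular and deceptive documents is the general shape of the detection method; which features actually shift under imitation or obfuscation is the paper's empirical question.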
Economics of Information Security | 2004
Stuart E. Schechter; Rachel Greenstadt; Michael D. Smith
To thwart piracy the entertainment industry must keep distribution costs high, reduce the size of distribution networks, and (if possible) raise the cost of extracting content. However, if ‘trusted computing’ mechanisms deliver on their promises, large peer-to-peer distribution networks will be more robust against attack and trading in pirated entertainment will become safer, more reliable, and thus cheaper. Since it will always be possible for some individuals to extract content from the media on which it is stored, future entertainment may be more vulnerable to piracy than before the introduction of ‘trusted computing’ technologies.
computer and communications security | 2014
Marc Juarez; Sadia Afroz; Gunes Acar; Claudia Diaz; Rachel Greenstadt
Recent studies on Website Fingerprinting (WF) claim to have found highly effective attacks on Tor. However, these studies make assumptions about user settings, adversary capabilities, and the nature of the Web that do not necessarily hold in practical scenarios. The following study critically evaluates these assumptions by conducting the attack where the assumptions do not hold. We show that certain variables, such as users' browsing habits and differences in the location and version of the Tor Browser Bundle, that are usually omitted from the current WF model have a significant impact on the efficacy of the attack. We also empirically show how prior work succumbs to the base rate fallacy in the open-world scenario. We address this problem by augmenting our classification method with a verification step. We conclude that even though this approach reduces the number of false positives by over 63%, it does not completely solve the problem, which remains an open issue for WF attacks.
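The base rate fallacy mentioned above can be made concrete with a short Bayes' rule calculation. The TPR/FPR numbers below are illustrative assumptions, not the paper's measured values:

```python
def open_world_precision(tpr, fpr, base_rate):
    """Precision of a WF classifier when only a fraction `base_rate`
    of observed page loads actually belong to monitored sites."""
    tp = tpr * base_rate          # true positives per unit traffic
    fp = fpr * (1 - base_rate)    # false positives per unit traffic
    return tp / (tp + fp)

# A seemingly strong classifier: 90% TPR, 1% FPR.
high_prior = open_world_precision(0.90, 0.01, 0.5)    # lab-style 50/50 split
low_prior = open_world_precision(0.90, 0.01, 0.001)   # rare monitored visits
print(round(high_prior, 3), round(low_prior, 3))
```

When monitored pages are rare, most positives are false alarms even at a low false-positive rate, which is why the open-world evaluation and the verification step matter.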
privacy enhancing technologies | 2012
Andrew W. E. McDonald; Sadia Afroz; Aylin Caliskan; Ariel Stolerman; Rachel Greenstadt
This paper presents Anonymouth, a novel framework for anonymizing writing style. Without accounting for style, anonymous authors risk identification. This framework is necessary to provide a tool for testing the consistency of anonymized writing style and a mechanism for adaptive attacks against stylometry techniques. Our framework defines the steps necessary to anonymize documents and implements them. A key contribution of this work is this framework, including novel methods for identifying which features of documents need to change and how they must be changed to accomplish document anonymization. In our experiment, 80% of the user study participants were able to anonymize their documents with respect to a fixed corpus and the limited feature set used. However, modifying pre-written documents was found to be difficult, and the anonymization did not hold up to more extensive feature sets. It is important to note that Anonymouth is only the first step toward a tool to achieve stylometric anonymity with respect to state-of-the-art authorship attribution techniques. The topic needs further exploration in order to accomplish significant anonymity.
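A hedged sketch of the kind of feature ranking such a framework might perform: flag the features on which the document deviates most from a background corpus, in standard deviations, as the ones to modify first. The feature names and statistics here are invented for illustration, not Anonymouth's actual feature set:

```python
def rank_features_to_change(doc_vec, corpus_mean, corpus_std, names):
    """Rank features by how far the document lies from the background
    corpus (z-score); the most deviant features are the most
    identifying, so an anonymization tool targets them first."""
    scored = []
    for x, mu, sd, name in zip(doc_vec, corpus_mean, corpus_std, names):
        z = abs(x - mu) / sd if sd > 0 else 0.0
        scored.append((z, name))
    return [name for z, name in sorted(scored, reverse=True)]

names = ["avg_word_len", "comma_rate", "freq_the"]
order = rank_features_to_change([5.2, 0.08, 0.02],   # document values
                                [4.1, 0.05, 0.06],   # corpus means
                                [0.3, 0.02, 0.01],   # corpus std devs
                                names)
print(order)
```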
ieee symposium on security and privacy | 2014
Sadia Afroz; Aylin Caliskan Islam; Ariel Stolerman; Rachel Greenstadt; Damon McCoy
Stylometry is a method for identifying the authors of anonymous texts by analyzing their writing style. While stylometric methods have produced impressive results in previous experiments, we wanted to explore their performance on a challenging dataset of particular interest to the security research community. Analysis of underground forums can provide key information about who controls a given bot network or sells a service, and about the size and scope of the cybercrime underworld. Previous analyses have been accomplished primarily through analysis of limited structured metadata and painstaking manual analysis. However, the key challenge is to automate this process, since this labor-intensive manual approach clearly does not scale. We consider two scenarios. The first involves text written by an unknown cybercriminal and a set of potential suspects. This is a standard supervised stylometry problem made more difficult by multilingual forums that mix l33t-speak conversations with data dumps. In the second scenario, the goal is to feed a forum into an analysis engine and have it output possible doppelgangers, or users with multiple accounts. While other researchers have explored this problem, we propose a method that produces good results on actual separate accounts, as opposed to data sets created by artificially splitting authors into multiple identities. For scenario 1, we achieve 77% to 84% accuracy on private messages. For scenario 2, we achieve 94% recall with 90% precision on blogs and 85.18% precision with 82.14% recall for underground forum users. We demonstrate the utility of our approach with a case study that includes applying our technique to the Carders forum and manual analysis to validate the results, enabling the discovery of previously undetected doppelganger accounts.
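A minimal sketch of the second scenario, assuming each account's style is summarized as a numeric feature vector: flag account pairs whose vectors are unusually similar. The profiles, feature dimensions, and threshold are illustrative assumptions, not the paper's method or data:

```python
import math

def cosine(u, v):
    """Cosine similarity between two style vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def candidate_doppelgangers(profiles, threshold=0.95):
    """Flag account pairs with suspiciously similar writing style."""
    names = sorted(profiles)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(profiles[a], profiles[b]) >= threshold:
                pairs.append((a, b))
    return pairs

profiles = {
    "alice": [0.9, 0.1, 0.4],
    "bob": [0.1, 0.8, 0.2],
    "mallory_2": [0.88, 0.12, 0.41],  # stylistically close to alice
}
print(candidate_doppelgangers(profiles))
```

Candidate pairs would still need manual validation, as in the Carders case study above.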
IEEE Systems Journal | 2017
Lex Fridman; Steven Weber; Rachel Greenstadt; Moshe Kam
Active authentication is the problem of continuously verifying the identity of a person based on behavioral aspects of their interaction with a computing device. In this paper, we collect and analyze behavioral biometrics data from 200 subjects, each using their personal Android mobile device for a period of at least 30 days. This data set is novel in the context of active authentication due to its size, duration, number of modalities, and absence of restrictions on tracked activity. The geographical colocation of the subjects in the study is representative of a large closed-world environment such as an organization where the unauthorized user of a device is likely to be an insider threat: coming from within the organization. We consider four biometric modalities: 1) text entered via soft keyboard, 2) applications used, 3) websites visited, and 4) physical location of the device as determined from GPS (when outdoors) or WiFi (when indoors). We implement and test a classifier for each modality and organize the classifiers as a parallel binary decision fusion architecture. We are able to characterize the performance of the system with respect to intruder detection time and to quantify the contribution of each modality to the overall performance.
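A parallel binary decision fusion rule of the kind described can be sketched as a log-likelihood-ratio weighted vote (Chair-Varshney style). The modality operating points below are assumptions for illustration, not the paper's measured performance:

```python
import math

def fuse_decisions(decisions, tprs, fprs):
    """Fuse parallel binary classifiers: each modality votes
    1 (legitimate user) or 0 (intruder), weighted by the
    log-likelihood ratio implied by its TPR/FPR."""
    score = 0.0
    for d, tpr, fpr in zip(decisions, tprs, fprs):
        if d == 1:
            score += math.log(tpr / fpr)
        else:
            score += math.log((1 - tpr) / (1 - fpr))
    return 1 if score > 0 else 0

# Illustrative operating points for the four modalities
# (text entry, applications, websites, location).
tprs = [0.80, 0.70, 0.75, 0.90]
fprs = [0.10, 0.20, 0.15, 0.05]
print(fuse_decisions([1, 0, 1, 1], tprs, fprs))
```

This weighting means a reliable modality (e.g. location, in this toy setting) can outvote a noisy one, which is one way to quantify each modality's contribution to the fused decision.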
Proceedings of the 2013 ACM workshop on Artificial intelligence and security | 2013
Alex Kantchelian; Sadia Afroz; Ling Huang; Aylin Caliskan Islam; Brad Miller; Michael Carl Tschantz; Rachel Greenstadt; Anthony D. Joseph; J. D. Tygar
In this position paper, we argue that to be of practical interest, a machine-learning based security system must engage with the human operators beyond feature engineering and instance labeling to address the challenge of drift in adversarial environments. We propose that designers of such systems broaden the classification goal into an explanatory goal, which would deepen the interaction with systems operators. To provide guidance, we advocate for an approach based on maintaining one classifier for each class of unwanted activity to be filtered. We also emphasize the necessity for the system to be responsive to the operators' constant curation of the training set. We show how this paradigm provides a property we call isolation and how it relates to classical causative attacks. In order to demonstrate the effects of drift on a binary classification task, we also report on two experiments using a previously unpublished malware data set where each instance is timestamped according to when it was seen.
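The one-classifier-per-class design can be sketched as follows; the trainer and samples are toy stand-ins for a real malware or abuse classifier, invented for illustration:

```python
class PerClassFilter:
    """One independent binary classifier per class of unwanted
    activity. Retraining one class's model on its curated training
    set leaves the others untouched (the isolation property), and
    reporting every flagging class is more explanatory than a
    single opaque verdict."""
    def __init__(self, train_fn):
        self.train_fn = train_fn   # builds a model from labeled data
        self.models = {}

    def curate(self, label, positives, negatives):
        # The operator updates the training set for one class only.
        self.models[label] = self.train_fn(positives, negatives)

    def explain(self, sample):
        # Report every class whose model flags the sample.
        return sorted(lbl for lbl, m in self.models.items() if m(sample))

# Toy trainer: flag samples containing any curated positive token.
def keyword_trainer(positives, negatives):
    tokens = set(positives) - set(negatives)
    return lambda sample: bool(tokens & set(sample.split()))

f = PerClassFilter(keyword_trainer)
f.curate("spam", ["viagra", "lottery"], ["meeting"])
f.curate("phishing", ["password", "verify"], ["meeting"])
print(f.explain("please verify your password"))
```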
adaptive agents and multi-agents systems | 2006
Rachel Greenstadt; Jonathan P. Pearce; Emma Bowring; Milind Tambe
Distributed Constraint Optimization (DCOP) is rapidly emerging as a prominent technique for multiagent coordination. Unfortunately, rigorous quantitative evaluations of privacy loss in DCOP algorithms have been lacking despite the fact that agent privacy is a key motivation for applying DCOPs in many applications. Recently, Maheswaran et al. [3, 4] introduced a framework for quantitative evaluations of privacy in DCOP algorithms, showing that early DCOP algorithms lose more privacy than purely centralized approaches and questioning the motivation for applying DCOPs. Do state-of-the art DCOP algorithms suffer from a similar shortcoming? This paper answers that question by investigating the most efficient DCOP algorithms, including both DPOP and ADOPT.
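For readers unfamiliar with DCOP, a minimal instance, solved here by centralized brute force, looks like this. Algorithms such as DPOP and ADOPT instead solve it via distributed message passing among the agents, and how much each agent's private valuations leak through those messages is exactly the privacy question studied above:

```python
from itertools import product

# Minimal DCOP instance: two agents each choose a value from a shared
# domain; the sum of unary utilities and the shared pairwise constraint
# is jointly maximized.
domain = ["a", "b"]
unary = {"x1": {"a": 1, "b": 0}, "x2": {"a": 0, "b": 2}}
pairwise = {("a", "a"): 3, ("a", "b"): 0, ("b", "a"): 0, ("b", "b"): 1}

best = max(
    product(domain, repeat=2),
    key=lambda av: unary["x1"][av[0]] + unary["x2"][av[1]] + pairwise[av],
)
print(best)
```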
computer and communications security | 2008
Rachel Greenstadt; Jacob Beal
Humans should be able to think of computers as extensions of their body, as craftsmen do with their tools. Current security models, however, are too unlike those used in human minds; for example, computers authenticate users by challenging them to repeat a secret rather than by continually observing the many subtle cues offered by their appearance and behavior. We propose two lines of research that can be combined to produce cognitive security on computers and other personal devices: continuously deployed multi-modal biometrics and adjustably autonomous security.
ieee international conference semantic computing | 2012
Aylin Caliskan; Rachel Greenstadt
In this paper, we investigate the effects of machine translation tools on translated texts and the accuracy of authorship and translator attribution of translated texts. We show that the more translation performed on a text by a specific machine translation tool, the more effects unique to that translator are observed. We also propose a novel method to perform machine translator and authorship attribution of translated texts using a feature set that led to 91.13% and 91.54% accuracy on average, respectively. We claim that the features leading to the highest accuracy in translator attribution are translator-dependent features and that even though translator-effect-heavy features are present in translated text, we can still succeed in authorship attribution. These findings demonstrate that stylometric features of the original text are preserved at some level despite multiple successive translations and the introduction of translator-dependent features. The main contribution of our work is the discovery of a feature set used to accurately perform both translator and authorship attribution on a corpus of diverse topics from the twenty-first century, which was subsequently translated multiple times using machine translation tools.