Haystack: A Multi-Purpose Mobile Vantage Point in User Space
Abbas Razaghpanah, Narseo Vallina-Rodriguez, Srikanth Sundaresan, Christian Kreibich, Phillipa Gill, Mark Allman, Vern Paxson
Abbas Razaghpanah
Stony Brook University
Narseo Vallina-Rodriguez
ICSI
Srikanth Sundaresan
ICSI
Christian Kreibich
ICSI / Lastline
Phillipa Gill
Stony Brook University
Mark Allman
ICSI
Vern Paxson
ICSI / UC Berkeley
Abstract
Despite our growing reliance on mobile phones for a wide range of daily tasks, their operation remains largely opaque. A number of previous studies have addressed elements of this problem in a partial fashion, trading off analytic comprehensiveness and deployment scale. We overcome the barriers to large-scale deployment (e.g., requiring rooted devices) and comprehensiveness of previous efforts by taking a novel approach that leverages the VPN API on mobile devices to design Haystack, an in-situ mobile measurement platform that operates exclusively on the device, providing full access to the device’s network traffic and local context without requiring root access. We present the design of Haystack and its implementation in an Android app that we deploy via standard distribution channels. Using data collected from 450 users of the app, we exemplify the advantages of Haystack over the state of the art and demonstrate its seamless experience even under demanding conditions. We also demonstrate its utility to users and researchers in characterizing mobile traffic and privacy risks.
1. INTRODUCTION
Mobile phones have become indispensable aids to everyday life by offering users capabilities that rival those of general purpose computers. However, these systems remain notoriously opaque, as mobile operating systems tightly control access to system resources. While this tight control is useful in preventing unwanted application activity, it also imposes hurdles for understanding the behavior of mobile devices, especially their network activity and performance.
Despite these challenges, the research community has made steady progress in understanding mobile apps and mobile traffic over the past few years, by using two broad classes of techniques. One class is lab-oriented and uses static and dynamic analysis of app source code [22, 56], controlled execution of apps [24, 38] and dynamic analysis [68], even modifying the OS kernel to track app behavior [23]. A contrasting approach leverages network traces obtained from ISPs [27, 62] or VPN tunnels that forward user traffic [52] to servers in the cloud for observation. However, each of these previous approaches faces a trade-off:
• Approaches based on static and dynamic analysis do not offer access to real-world data. Thus far, studies that have used these approaches have been constrained to analysis of source code, which is not always available, or artificial/controlled user inputs which require significant human effort to train the techniques, contextualize the results, and minimize false positives. One exception is Taintdroid [23], a modified Android version that can analyze app behavior in real-world settings. However, this technique relies on operating system modifications, which incur a significant engineering effort to catch up with new OS releases, and forces participating users to install a new firmware on their devices [65]. Consequently, the scale of analysis and app coverage they can achieve in the wild remains limited to tens of users.
• Approaches that leverage network traffic obtain visibility into real user behavior, at the cost of the richness of context that device-centric approaches can obtain. For example, while this approach can capture and analyze mobile network data, heuristics must infer which applications generated individual flows, and detouring traffic through third-party middleboxes complicates high-fidelity performance measurements due to the necessarily skewed vantage point.
In this study, we present Haystack, the first on-device mobile measurements platform that is able to passively monitor app behavior and network traffic under regular usage and network conditions, without requiring users to root the phone. The latter gives Haystack the potential for better scalability in deployment; users can simply install the app from Google’s Play Store or similar markets. This provides us the opportunity to monitor organic mobile network activity as generated by real users in real networks using real mobile apps, all from the vantage point of the device. This combination of ease of deployability and the high-fidelity vantage point allows Haystack to hit a sweet spot in the trade-off between scalability and richness of data.
Similar to previous approaches [52], Haystack leverages Android’s standard VPN interface to capture outbound packets from applications. However, rather than tunneling the packets to a remote VPN server for inspection, Haystack intercepts, inspects, and forwards the user’s traffic to its intended destination. This approach gives us raw packet-level access to outbound packets as well as flow-level access to incoming traffic without modifying the network path, and without requiring permissions beyond those needed by the VPN interface. Haystack therefore has the ability to monitor network activity in the proper context by operating locally on the device.
For example, a TCP connection can be associated with a specific DNS lookup and both can be coupled with the originating application. Further, we design Haystack to be extensible with new analyses and measurements added over time (e.g., by adding new protocol parsers and by supporting advanced measurement methods such as reactive measurements [10]), and new features to attract and educate users (e.g., ad block, malware detection, privacy leak prevention and network troubleshooting).
Haystack is publicly available for anyone to install on Google Play and has been installed by 450 users to date [40]. We discuss the design and implementation of Haystack in § §4, and evaluate its performance and resource use in §5. Our tests show that Haystack delivers sufficient throughput (26–55 Mbps) at low latency overhead (2–3 ms) to drive high-performance and delay-sensitive applications such as HD video streaming and VoIP without noticeable performance degradation for the user.
While we consider our Haystack implementation prototypical in some respects (such as UI usability for non-technical users), it has already provided interesting insights into app usage in the wild: in §
2. RELATED WORK
Previous studies have leveraged a variety of techniques for understanding privacy risks of mobile apps and their behavior in the network. As noted earlier, each approach made trade-offs between having access to real user behavior and device context. We classify the prior work into the following four categories.
Dynamic app analysis:
This approach calls for running an app in a controlled environment such as a virtual machine [68] or an instrumented OS [23, 38]. The app is then monitored as it conducts its pre-defined set of tasks, with the results indicating precisely how the app and system behave during the test (e.g., whether the app exfiltrated data). While this approach provides useful insights, the workload (which does not represent real-world operation) and the difficulty of deploying custom firmware on users’ phones (sacrificing scale) mean that the results do not directly speak to normal users’ activity. To overcome the lack of user input, studies that rely on dynamic analysis require “UI monkeys” [11, 66] to generate synthetic user actions.
Static app analysis:
This technique involves analysis of the app code, obtained by decompiling app binaries, via symbolic execution [67], analysis of control flow graphs [16, 22], by auditing third-party library use [21, 55], through inspection of the Android permissions and their associated system calls [16, 45], and analysis of app properties (e.g., whether apps employ secure communications) [24, 26]. While static analysis typically provides good scale, with analysis of over 10K apps in several studies [24] (modulo computational resources), this strategy does not reflect the behavior of apps in the wild, and typically requires a good amount of manual inspection. Furthermore, the analysis may under- or over-state the importance of certain code paths since it lacks a notion of how users interact with the apps in practice.
Passive traffic analysis:
A number of studies rely on volunteers with rooted phones that allow their traffic to get recorded by tcpdump [25, 39, 51] or iptables [17, 60]. These methods are challenging to deploy at scale. To obtain larger-scale data, other projects study the behavior of mobile devices by observing their network traffic either at a large ISP with millions of users [27, 57, 62] or by forwarding traffic through a remote VPN proxy that also modifies the network path [44, 52, 64]. As a result, these studies contain a large variety of apps and mobile platforms but they lack device context for accountability and accuracy (e.g., mapping flows to originating apps). While this can be alleviated by pairing a remote VPN proxy with client-side software to provide context to the remote VPN server [44], the solution still alters the network path by rerouting traffic to the VPN server, hence providing an unrealistic view of the performance aspects of real mobile traffic. PrivacyGuard [59] uses a technique similar to the one used by Haystack to intercept user traffic to detect simple instances of private information leaks (e.g., device ID and location), but it does not aim to offer the depth and versatility offered by Haystack as a measurement platform.
Active mobile network measurements:
Google Play (and, on a smaller scale, the Apple Store) contains a number of tools for active mobile network measurements. Examples include Ookla’s SpeedTest [34], the FCC Speed Test [28], network scanners to build network coverage maps [35], and comprehensive measurement tools such as My Speed Test [30], Netalyzr for Android [32], NameHelp [31], and MobiPerf [29]. Such tools provide valuable insight into network performance [41, 47] and operational aspects of ISPs such as middlebox deployment [63] and traffic discrimination [42]. However, despite the fact that active measurement techniques typically provide an accurate snapshot of actual network conditions, they do not study network performance of installed apps in real-world situations.
Approach | Scale | Real-world operation | Comprehensiveness: Local operation | App coverage | OS compatibility
ISP traces | Large-scale | ✓ | | All apps | All versions/platforms
Remote VPN | Crowdsourcing | ✓ | | Crowdsourcing | All versions/platforms
Static analysis | Resource-bound | [remaining cells and rows not recoverable from the extracted text: ✓ ✓ ∼ ✓ ✓ ∼, 100 apps, Limited]
Table 1: Comparison between different measurement approaches in the mobile environments. It should be noted that these aspects/features are not easily comparable in a binary manner and the comparison provided here is merely qualitative.
Table 1 provides a high-level comparison of each of the measurement approaches. As we can see, none can simultaneously observe real-world operation while providing comprehensive data at scale. This trade-off has prevented the research community from exploring in detail many aspects of the mobile ecosystem. We will revisit the comparison between Haystack and state of the art techniques in §6, after we present its design, implementation, and evaluation.
3. HAYSTACK OVERVIEW
Our goal with Haystack is to help researchers avoid the trade-off between accessing device context and the ability to measure real-world phone usage at scale. The crux of Haystack is its ability to observe network communication on the mobile device itself. Since 2011 (version 4.0), Android has provided a VPN API that enables developers to create a virtual tun interface and direct all network traffic on the phone to the interface’s user-space process. To enable this functionality, the client app requests the BIND_VPN_SERVICE permission from the user, which, crucially, does not require a rooted device. The API typically drives VPN client applications that forward traffic to a remote VPN server [14].
Instead of relaying packets to a remote VPN server, Haystack performs two high-level operations in parallel. First, it sends a copy of the bidirectional packet stream to a background process that analyzes the traffic off-path. Second, it uses the packet headers to maintain user-space network sockets to remote hosts and relays data via these sockets.
Haystack is available in the Google Play Store [40] and has been downloaded a total of 450 times.
A number of apps in Google Play leverage the BIND_VPN_SERVICE permission for non-VPN tasks. While we are not aware of any apps taking traffic processing to the level we realize in Haystack, tPacketCapture [37] and SSL Packet Capture [36] take advantage of the VPN API to record approximate packet traces via a user-space application. As with Haystack, these traces are approximate since the app does not have access to raw packets via Java’s socket interface. A related app, NoRoot Firewall [33], allows mobile users to block traffic generated by specific apps and generate connection logs.
Haystack’s ability to observe real-world user data raises many ethical considerations [7]. We leverage the fact that Haystack runs on the user’s device to do the bulk of processing on the device and only send back summary statistics (e.g., domains contacted and protocols used) and under no circumstances the user’s raw traffic. We aim to minimize the amount of data sent back while maximizing its utility. In consultation with the IRB at UC Berkeley, we developed a protocol that strikes a balance and only collects data needed for the studies at hand without uploading any personal information. This precludes certain types of detailed or longitudinal studies, which may be possible with future coordination with the IRB.
Additionally, we implement informed consent and opt-in in Haystack. First, Haystack must be explicitly installed by the user and granted permission to observe traffic. Second, we require users to opt in a second time before we analyze encrypted traffic as described in §
4. SYSTEM DESIGN
To intercept and analyze traffic on resource-constrained devices in user space, we must address several design challenges. A key issue is that the tun interface exposes raw IP packets to Haystack. A natural way to deal with these would be to shuttle a copy to our analysis engine and then drop the packet on the network via a raw socket. However, non-privileged apps do not have access to raw sockets and therefore we must rely on regular Java sockets to communicate with remote entities. This means that, as opposed to transparent L3 and L4 proxies that operate at a single layer of the protocol stack on both sides (with root privileges), Haystack has to bridge packet-level communication on the host (tun) side and flow-level interaction with the network side. Operating in mobile phones in user space requires careful design considerations to minimize Haystack’s impact on device resources, battery life, app performance and user experience. Figure 1 illustrates the Haystack architecture, which includes two major components, the Forwarder and the Traffic Analyzer (TA).
[Figure 1: architecture diagram with components Apps, tun, Forwarder, TLS Proxy, Traffic Analyzer (TA), Java sockets, SSL sockets, default gateway, and Internet; the legend distinguishes non-encrypted traffic, encrypted traffic, and off-path channels.]
Figure 1: The Haystack architecture, highlighting system components and data forwarding channels. Solid lines represent the actual forwarding path for traffic generated by mobile apps even if encrypted (which is handled by our optional TLS proxy), while dashed lines represent the off-line path used for privacy and performance analysis.
[Figure 2: state diagram with states Sleep, tun read, and nio read. Transition labels: sleep < IDLE_SLEEP; sleep = IDLE_SLEEP; idle_count = MAX_IDLE_CYCLES; packets_read < MAX_READ_tun and tun read succeeded; packets_read == MAX_READ_tun or nothing to read from tun; packets_read < MAX_READ_nio and nio read succeeded; packets_read == MAX_READ_nio or nothing to read from nio.]
Figure 2: Haystack’s Forwarder state machine. It controls read/write operations and transitions between tun interface, Java NIO socket, and sleep states. The idle_count variable increments when both tun and NIO do not succeed, i.e., there is nothing to read. Each read operation from the tun interface potentially becomes a write operation for a NIO socket and vice versa.
The Forwarder performs two key functions: (i) it performs transparent bridging between packets on the tun interface and payload data on the regular socket interface and (ii) it forwards traffic to the TA for analysis. The Forwarder receives raw IP packets from tun. The Forwarder therefore acts like a layer 3/layer 4 network stack: it extracts the payload from the raw packet and sends it to its intended destination through a regular Java socket (implemented using non-blocking NIO sockets [48]). To accomplish this, the Forwarder extracts flow state from the packet headers (IP, as well as UDP or TCP) for packets arriving on the tun interface and maps it to a given Java socket (it creates new sockets for new flows arriving on the tun interface). It also maintains this state so that it can marshal data arriving from remote hosts on the sockets back into packets for transmission to the app via the tun interface. Haystack has dual-stack support and its routing tables correctly forward DNS and IPv6 traffic through the tun interface to prevent traffic leaks [49].
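The flow-to-socket mapping described above can be illustrated with a minimal Python sketch. It is not Haystack's implementation (which is written in Java on top of NIO sockets); the names `FlowKey`, `parse_ipv4_packet`, and `FlowTable` are hypothetical, only IPv4 is handled, and the socket is created by a caller-supplied factory so that the sketch stays free of real network I/O.

```python
import struct
from typing import Callable, Dict, NamedTuple, Tuple


class FlowKey(NamedTuple):
    """Hypothetical 5-tuple identifying a flow seen on the tun interface."""
    proto: int       # 6 = TCP, 17 = UDP
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def parse_ipv4_packet(packet: bytes) -> Tuple[FlowKey, bytes]:
    """Return the flow key and the IP payload (transport header plus data)."""
    if packet[0] >> 4 != 4:
        raise ValueError("only IPv4 is handled in this sketch")
    ihl = (packet[0] & 0x0F) * 4              # IP header length in bytes
    proto = packet[9]
    if proto not in (6, 17):
        raise ValueError("only TCP and UDP can be bridged to sockets")
    src_ip = ".".join(str(b) for b in packet[12:16])
    dst_ip = ".".join(str(b) for b in packet[16:20])
    # TCP and UDP both start with source port and destination port.
    src_port, dst_port = struct.unpack("!HH", packet[ihl:ihl + 4])
    return FlowKey(proto, src_ip, src_port, dst_ip, dst_port), packet[ihl:]


class FlowTable:
    """Maps each flow to one socket-like object, creating it on first use."""

    def __init__(self, open_socket: Callable[[FlowKey], object]):
        self._open_socket = open_socket       # assumed factory, e.g. a connect()
        self._sockets: Dict[FlowKey, object] = {}

    def socket_for(self, packet: bytes) -> object:
        key, _ = parse_ipv4_packet(packet)
        if key not in self._sockets:
            self._sockets[key] = self._open_socket(key)
        return self._sockets[key]
```

Keeping the key as the full 5-tuple means that the same table entry can later be used in the reverse direction, to rebuild headers for data arriving on the socket.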
Handling UDP and TCP:
The Forwarder needs to maintain state for UDP and TCP flows. A simple flow-to-socket mapping suffices for connectionless UDP, since header reconstruction remains straightforward. Since TCP provides connection-oriented and reliable transport, we need to track the TCP state machine and maintain sequence and acknowledgment numbers for each TCP flow. We segment the data stream received from the socket and synthesize TCP headers to be able to forward the resulting packets to the tun interface for delivery to the app. When we read a SYN packet from the tun interface, we create a new socket, connect to the target and instantiate state in Haystack. After the OS establishes the socket we return a SYN/ACK via the tun interface. We similarly relay connection termination. We discuss Haystack’s lack of support for non-TCP/UDP traffic in §
Efficient packet forwarding:
The Forwarder must balance application and traffic performance with power and CPU usage on the device. This task is challenging because the tun interface does not expose an event-based API. We therefore implement a polling scheme that periodically checks both the tun interface and Java sockets for arriving data.
Figure 2 shows the state machine of the Forwarder. It reads up to max_read_tun packets from the tun interface or up to max_read_nio packets from the socket (NIO) interface before switching to the other interface, hence preventing either operation from starving. The Forwarder immediately transitions to the other read state if it cannot read data in the current state. Each read from the tun interface potentially becomes a write operation for a socket and vice versa, the exception being pure TCP ACKs from the tun interface. We discard these, as their effect gets abstracted by the socket interface. Writes to the tun interface complete quickly and socket writes do not block, so we perform writes as soon as we have data to send. If it cannot read data from either interface for max_idle_cycles consecutive iterations, the Forwarder will sleep for idle_sleep ms. While this strategy reduces power consumption during idle periods, it also imposes higher latency on packets that arrive during these idle periods when polling happens at coarse intervals. We consider the tradeoff between resource conservation and performance in depth in §
(Footnote: Despite the inability to count packets from socket read/write operations, we count the number of packets generated and sent back through the tun interface.)
Many mobile applications have adopted TLS as the default cryptographic protocol for data communications. This is a double-edged sword, as it helps protect the integrity and privacy of users’ transactions but also allows apps to conceal their network activity. With the user’s consent, Haystack employs a transparent man-in-the-middle (MITM) proxy for TLS traffic [20].
At install time Haystack requests that the user allow the installation of a self-signed Haystack CA certificate in the user CA certificate store. We customize the message shown to users at this time to explain why Haystack intercepts encrypted traffic.
Once equipped with a certificate, the Forwarder monitors TCP streams beginning with a TLS “Client Hello” message and forwards these flows, along with the flow-level meta-information the proxy requires in order to connect to the server (e.g., IP address, port, SNI), to the TLS proxy. The proxy uses this information to connect to the remote host and reports back to the Forwarder whether the connection was successful. After successfully establishing a connection to the remote host, the proxy decrypts traffic arriving on one interface (tun or socket) and re-encrypts it for relay to the other while providing a clear-text version to the TA for analysis.
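As an illustration of the first step, the following Python sketch shows one way to recognize a TLS “Client Hello” at the start of a TCP stream and to read the SNI host name from it. The function names are hypothetical, the parsing follows the standard TLS record and handshake layout rather than Haystack's own code, and the sketch assumes the whole Client Hello arrives in the first chunk of stream data.

```python
import struct
from typing import Optional


def is_client_hello(data: bytes) -> bool:
    """True if the stream starts with a TLS handshake record carrying a Client Hello."""
    return len(data) >= 6 and data[0] == 0x16 and data[1] == 0x03 and data[5] == 0x01


def extract_sni(data: bytes) -> Optional[str]:
    """Return the server name from the SNI extension, or None if absent or malformed."""
    if not is_client_hello(data):
        return None
    try:
        pos = 5 + 4                                          # record + handshake headers
        pos += 2 + 32                                        # client version + random
        pos += 1 + data[pos]                                 # session id
        pos += 2 + struct.unpack_from("!H", data, pos)[0]    # cipher suites
        pos += 1 + data[pos]                                 # compression methods
        ext_end = pos + 2 + struct.unpack_from("!H", data, pos)[0]
        pos += 2
        while pos + 4 <= min(ext_end, len(data)):
            ext_type, ext_len = struct.unpack_from("!HH", data, pos)
            pos += 4
            if ext_type == 0:                                # server_name extension
                # Layout: list length (2), name type (1), name length (2), name.
                name_type = data[pos + 2]
                name_len = struct.unpack_from("!H", data, pos + 3)[0]
                if name_type != 0:
                    return None
                return data[pos + 5:pos + 5 + name_len].decode("ascii")
            pos += ext_len                                   # skip other extensions
    except (IndexError, struct.error, UnicodeDecodeError):
        return None
    return None
```

A flow for which `is_client_hello` holds would be handed to the TLS proxy together with its destination address, port, and the extracted name; any other flow stays on the plain forwarding path.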
Dealing with failed TLS connections:
As in any commercial TLS proxy, Haystack will be unable to proxy flows when the client application (i) uses TLS extensions not supported by Haystack [19], (ii) bundles its own trust store, or (iii) implements certificate pinning. Likewise, failure occurs when the server expects to see certain TLS extensions not supported by Haystack in the “Client Hello” message or performs certificate-based client authentication. We add connections with failed TLS handshakes to a whitelist that bypasses the TLS proxy for a period of five minutes. Experience with our initial set of users indicates that apps recover gracefully from TLS failures. After five minutes we remove the app from the whitelist to account for transient failures in the handshake process. While we cannot decrypt such flows, we can still record which apps take these security measures and potentially communicate more securely for further analysis.
(Footnote: Currently, Haystack only supports the SNI extension.)
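The five-minute bypass can be sketched as a small table of expiry times. The class below is a hypothetical illustration, not Haystack's code: it keys entries on an opaque string (an app name here, although the text leaves open whether the key is the app or the connection) and takes the clock as a parameter so that expiry is easy to exercise.

```python
import time
from typing import Callable, Dict


class TlsBypassList:
    """Entries that skip the TLS proxy until their time-to-live runs out."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds          # five minutes, as in the text
        self._clock = clock
        self._expiry: Dict[str, float] = {}

    def record_failure(self, key: str) -> None:
        """Called after a failed TLS handshake through the proxy."""
        self._expiry[key] = self._clock() + self._ttl

    def should_bypass(self, key: str) -> bool:
        """True while the entry is fresh; expired entries are dropped."""
        expiry = self._expiry.get(key)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            del self._expiry[key]        # retry interception after the TTL
            return False
        return True
```

Removing the entry on expiry, rather than keeping it forever, is what lets interception resume after a transient handshake failure.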
Security considerations:
Android provides support for third-party root certificate installation. This is a feature required by enterprise networks to perform legitimate TLS interception. For increased security, Haystack generates a unique certificate and key-pair for each new installation of the app. Additionally, Haystack saves the private key to its private storage to prevent other applications from accessing it. While these precautions still permit malicious applications with root access to retrieve the key, such apps can already tap into the user’s encrypted traffic without using Haystack’s CA certificate (e.g., by surreptitiously injecting their own CA certificate into the system’s trust store).
The Traffic Analyzer (TA) processes flow data captured by the Forwarder. The TA operates in near real-time but off-path, i.e., outside the forwarding path of network traffic. The TA augments flows with contextual information gathered from the OS for further analysis. The analyses are protocol-agnostic, and the TA supports protocol parsers to parse flow contents before they are analyzed. We currently support TLS, HTTP, and DNS protocol parsers to analyze the traffic and extract relevant information, decompressing and decoding compressed and encoded data before they are searched by the DPI module for private information leakage. New protocol parsers can be added to the TA in case we see new protocols getting widely adopted.
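The decode-then-search step can be illustrated with a short Python sketch. It is only an assumption about how such a step might look: the helper names are hypothetical, only gzip compression and percent-encoding are handled, and the identifiers to search for are supplied by the caller.

```python
import gzip
import zlib
from typing import Dict, Iterator, Set
from urllib.parse import unquote_to_bytes


def decoded_views(payload: bytes) -> Iterator[bytes]:
    """Yield the payload as-is plus the decoded forms this sketch knows about."""
    yield payload
    inflated = None
    try:
        inflated = gzip.decompress(payload)
    except (OSError, EOFError, zlib.error):
        pass                                  # not gzip data, or truncated
    if inflated is not None:
        yield inflated
    for view in (payload, inflated):
        if view is not None and b"%" in view:
            yield unquote_to_bytes(view)      # undo percent-encoding


def find_leaks(payload: bytes, identifiers: Dict[str, bytes]) -> Set[str]:
    """Return the labels of all identifiers found in any decoded view."""
    found: Set[str] = set()
    for view in decoded_views(payload):
        for label, needle in identifiers.items():
            if needle in view:
                found.add(label)
    return found
```

Searching every view, rather than only the most decoded one, keeps the check robust when an identifier appears in the clear next to encoded fields.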
Why off-path analysis?
Traffic analysis could negatively affect the user experience if performed in the forwarding path. Analysis of network traffic can range from simple (e.g., tracking packet statistics) to quite complex (e.g., parsing protocol content); it can therefore consume valuable CPU cycles and, if conducted as part of traffic forwarding, could increase latency. However, as we will discuss in §6, certain aspects of mobile apps and networks must be measured in-path, as in the case of traffic performance analysis.
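The separation between forwarding and analysis can be sketched as a producer/consumer pair that shares an in-process queue. The Python class below is a hypothetical illustration rather than Haystack's code; in particular, the bounded queue and the choice to drop copies when the analyzer falls behind are assumptions made so that the forwarding path never blocks.

```python
import queue
import threading
from typing import Callable

_STOP = object()   # sentinel that tells the analysis thread to exit


class OffPathAnalyzer:
    """Runs analysis on a separate thread so that forwarding never waits for it."""

    def __init__(self, analyze: Callable[[bytes], None], max_pending: int = 1024):
        self._queue: "queue.Queue[object]" = queue.Queue(maxsize=max_pending)
        self._analyze = analyze
        self.dropped = 0                       # copies discarded under overload
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, payload: bytes) -> None:
        """Called from the forwarding path; returns immediately."""
        try:
            self._queue.put_nowait(payload)
        except queue.Full:
            self.dropped += 1                  # assumed policy: prefer low latency

    def close(self) -> None:
        """Process everything already queued, then stop the thread."""
        self._queue.put(_STOP)
        self._thread.join()

    def _run(self) -> None:
        while True:
            item = self._queue.get()
            if item is _STOP:
                return
            self._analyze(item)
```

Because the queue lives inside one process, the copied data is not exposed through the file system or a local socket.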
Secure and efficient IPC in Android:
Unfortunately, low-latency communication between Android services can prove tricky to realize, especially in multi-threaded systems. In our implementation we use Java’s thread-safe queues for communication between the Forwarder and TA modules. This allows the modules to communicate without exposing their data to other (malicious) apps as would be the case if the file system or localhost sockets were used [13]. In §
Application and entity mapping:
One of the basic functions the TA provides is to map TCP and UDP flows to the corresponding apps. We do this via a two-step process: (i) extract the PID of the process that generated the flow from the system’s proc directory, (ii) map the PID to an app name using Android’s Package Manager API. Compared to network-based studies which rely on inferences (e.g., using the HTTP User-Agent or destination IP address) to couple apps and flows [52, 53, 62], our approach allows highly accurate flow-to-app mappings. Since reading the PID and mapping applications requires file-system access, we cache recently read results to minimize overhead.
The TA also provides the ability to analyze protocols in depth. For example, the TA tracks DNS transactions to extract names associated with IP addresses, allowing us to map flows to target domains rather than just IP addresses. This is especially important for non-HTTP flows (e.g., QUIC, HTTPS) where the hostname may not be readily available in application layer headers. Mapping IPs to their hostnames gives us the opportunity to distinguish apps sending data to their own backend as opposed to third-party ad/analytics services or CDNs, even if both reside in the same cloud service provider [18]. Further, the TA can perform traffic characterization based on domain, without analyzing application-layer headers (e.g., HTTP Host header). We demonstrate how these capabilities in the TA can enable studies like per-app protocol usage and user tracking detection in §
Protocol support:
Android limits us to only TCP and UDP sockets via Java’s APIs, thus excluding protocols such as ICMP. As of today, this limitation only seems to affect a small number of network troubleshooting tools. The Forwarder provides IPv6 support, except for extension headers. We have not noticed any issues for IPv6 flows due to this limitation.
Recovery from loss of connectivity:
The VPN service (and therefore the Forwarder) gets disrupted when users roam between different networks such as 3G and WiFi or different WiFi networks, or when a network disconnection occurs. Haystack identifies such events and attempts to reconnect seamlessly. Similarly, phone calls disable all data network interfaces, thus stopping the VPN service. While currently this disables Haystack, we are working on using Android APIs to identify when the calls complete to transparently restart the VPN.
Vendor-custom firmware:
Many device vendors block and interfere with standard Android APIs. One case is Samsung’s KNOX SDK (only available for Samsung licensed partners), which prevents third-party VPN applications from creating virtual interfaces [54]. Likewise, some vendor-locked firmwares also prevent Haystack from intercepting TLS traffic by blocking CA certificate installation. We have thus far primarily encountered this issue on Samsung phones.
DPI and arms race:
Malicious agents will always have an incentive to avoid being identified. Despite our best efforts to parse and extract information from popular protocols, inflate compressed streams, and intercept conventional TLS-encrypted flows, and despite Haystack’s ability to support newer protocols (e.g., QUIC and new TLS extensions) as mobile apps and the mobile ecosystem as a whole evolve, some apps will still be able to exfiltrate private information through obfuscation and encryption schemes that are not supported by Haystack. Since Haystack would fall short of studying these instances, we acknowledge that there is a possibility of an arms race between privacy-invasive and malicious apps and approaches like Haystack.
5. PERFORMANCE EVALUATION
We have implemented Haystack as a user-level Android app per the design given above. Our implementation leverages a number of external libraries for tasks such as efficient packet parsing [2], IP geo-location [5], data presentation [6], and TLS interception [20]. The Haystack codebase (excluding the external libraries and XML GUI layouts) spans 15,000 lines of code. In this section we evaluate to what extent we achieve our goal of real-time monitoring without burdening the device’s resources in practice and under stress conditions.
To evaluate Haystack performance in a controlled setting, we set up a testbed with a Nexus 5 phone connected to a dedicated wireless access point over a 5 GHz 802.11n link. We also connected a small server to the access point via a gigabit Ethernet link. We minimize background traffic on the phone by only including the minimal set of pre-installed apps and not signing into Google Services. We measure the latency of Haystack using simple UDP and TCP echo packets. For non-TLS throughput tests, we use a custom-built speed-test that opens three parallel TCP connections to the server for 15 seconds in order to saturate the link. We test uplink and downlink separately. For profiling TLS establishment latency and downlink speed-test, we cannot use our speed-test, as it does not employ a TLS session. Instead, we download 1 B and 20 MB objects over HTTPS from an Apache v2 web server with a self-signed X.509 certificate. We repeat each test 25 times.
While our testbed allows us to explore many parameters within the design space, Android’s VPN security model precludes full automation of our experiments as it requires user interaction to enable/disable the tun interface. We focus on the impact of max_idle_cycles and idle_sleep and fix max_read_tun and max_read_nio to 100 packets, which favors downlink traffic.
[Figure 3: plot of CPU (%) against idle_sleep (ms), with one series per max_idle_cycles value.]
Figure 3: Haystack’s CPU overhead for different max_idle_cycles and idle_sleep configurations. The horizontal line indicates the aggregated average CPU load of all apps running in the background for reference.
CPU usage impacts interactivity of foreground apps and as a result, the user experience. We therefore investigate the impact of how idle a device must be before starting periodic polling (for a maximum of max_idle_cycles cycles) and how often we poll for new traffic after a device is deemed idle (idle_sleep ms) on CPU load and battery life.
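To make the roles of the two parameters concrete, the polling loop of the Forwarder (Figure 2) can be approximated by the Python sketch below. It is a simplified reading of the state machine, not the actual Java implementation: the read and write callables, `sleep`, and `should_stop` are hypothetical hooks, a read returns `None` when nothing is available, and the loop keeps polling once per idle_sleep interval until traffic reappears.

```python
import time
from typing import Callable, Optional


def forward_loop(read_tun: Callable[[], Optional[bytes]],
                 read_nio: Callable[[], Optional[bytes]],
                 write_nio: Callable[[bytes], None],
                 write_tun: Callable[[bytes], None],
                 *,
                 max_read_tun: int = 100,
                 max_read_nio: int = 100,
                 max_idle_cycles: int = 100,
                 idle_sleep_ms: float = 100.0,
                 sleep: Callable[[float], None] = time.sleep,
                 should_stop: Callable[[], bool] = lambda: False) -> None:
    idle_count = 0
    while not should_stop():
        moved = 0
        # tun read state: bounded burst so the socket side is not starved.
        for _ in range(max_read_tun):
            packet = read_tun()
            if packet is None:
                break
            write_nio(packet)
            moved += 1
        # nio read state: bounded burst so the tun side is not starved.
        for _ in range(max_read_nio):
            data = read_nio()
            if data is None:
                break
            write_tun(data)
            moved += 1
        if moved:
            idle_count = 0                   # traffic seen: stay in busy polling
            continue
        idle_count = min(idle_count + 1, max_idle_cycles)
        if idle_count >= max_idle_cycles:
            sleep(idle_sleep_ms / 1000.0)    # idle: poll once per idle_sleep
```

In this reading, max_idle_cycles only delays the first sleep by a few fast loop iterations, while idle_sleep sets both the CPU saved during idle periods and the worst-case extra delay for a packet that arrives while the loop is asleep.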
CPU load:
Mobile phones remain idle most of the time [17, 61]. As a result, optimizing Haystack’s performance in this scenario is essential to minimize its impact on limited system resources, in particular on battery life. The base CPU usage of the Nexus 5 is 2% in the absence of Haystack, when the system is idle with its screen off and normal background activity from installed apps. Figure 3 shows the impact of max_idle_cycles and idle_sleep on CPU usage when enabling Haystack. We find that idle_sleep has the most significant effect on CPU load, which is unsurprising as this parameter dictates how long the app sleeps and therefore does not consume CPU. With idle_sleep set to 1 ms, the CPU load varies between 45% and 55% for different values of max_idle_cycles, with the Forwarder polling the interfaces at a high frequency. CPU usage drops sharply as we increase idle_sleep, to 10.5% and 4.6% with idle_sleep at 10 ms and 25 ms, respectively. In contrast to idle_sleep, max_idle_cycles shows little influence on CPU overhead, particularly at idle_sleep values greater than 10 ms. This is because we measure max_idle_cycles in loop cycles (cf. Figure 2), which take a small fraction of 1 ms each. For idle_sleep of 100 ms and max_idle_cycles of 10 or 100 cycles the overhead of Haystack is negligible, with the CPU usage close to the base CPU usage (horizontal line in Figure 3). We consider an idle_sleep value of 100 ms ideal for operating during idle periods (delay-tolerant) and an idle_sleep value of 10 ms during interactive periods. In the following subsections, we evaluate the impact of idle_sleep on traffic performance.

Test Case             Power (mW), Mean / SD    Increase
Idle                  1,089.6 / 125.9
Idle (Haystack)       1,123.8 / 150.4          +3.1%
YouTube               1,755.3 / 35.5
YouTube (Haystack)    1,914.4 / 16.1           +9.1%

Table 2: Power consumption of Haystack when max_idle_cycles is 100 cycles and idle_sleep is 1 ms in different scenarios. The percentage indicates the increase when running Haystack.

[Figure 4 legend: timeline of incoming and outgoing packets while the CPU is active or inactive; t_buff = packet buffering time, t_proc = packet processing time.]
Figure 4:
Latency added by idle_sleep and max_idle_cycles on packets arriving during periods of activity or inactivity.
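The latency components named in Figure 4 can be sketched with a small model. This is a minimal illustration, not Haystack’s implementation: it assumes an idle Forwarder wakes at fixed multiples of idle_sleep, and the function names `buffering_delay_ms` and `added_latency_ms` are hypothetical.

```python
import math


def buffering_delay_ms(arrival_ms: float, idle_sleep_ms: float) -> float:
    """t_buff: time a packet waits until the next poll of an idle Forwarder.

    Assumption: while idle, the Forwarder polls at times 0, idle_sleep,
    2 * idle_sleep, ... (a simplification of the real loop).
    """
    if idle_sleep_ms <= 0:
        return 0.0
    next_poll_ms = math.ceil(arrival_ms / idle_sleep_ms) * idle_sleep_ms
    return next_poll_ms - arrival_ms


def added_latency_ms(arrival_ms: float, idle_sleep_ms: float,
                     t_proc_ms: float, forwarder_idle: bool) -> float:
    """Total added latency: t_buff (only if the Forwarder is idle) plus t_proc."""
    t_buff = buffering_delay_ms(arrival_ms, idle_sleep_ms) if forwarder_idle else 0.0
    return t_buff + t_proc_ms


if __name__ == "__main__":
    # A packet arriving 30 ms into a 100 ms sleep waits for the remaining 70 ms.
    print(added_latency_ms(30, 100, 0.141, forwarder_idle=True))
    # The same packet arriving while polling is active only pays t_proc.
    print(added_latency_ms(30, 100, 0.141, forwarder_idle=False))
```

Under this model a larger idle_sleep lowers CPU usage but raises the worst-case buffering delay, which is the trade-off the evaluation explores.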
User experience during interactive periods:
We next profile Haystack’s overhead under heavy load. To do so, we run Haystack and stream a 1080p YouTube video. This stresses packet forwarding, CPU usage, and the TLS Proxy, since YouTube delivers the video over TLS. Crucially, we do not observe delay, rebuffering events, or noticeable change in resolution during the video replay, suggesting that Haystack’s performance can keep up with demanding applications.
Power consumption:
We use the Monsoon Power Monitor [46] to directly measure the power consumed by Haystack on a BLU Studio X Plus phone running Android 5.0.2. We removed the battery and replaced it with the power meter set to emulate the phone’s standard 3.8 V battery. We then record the power consumed during various situations. Table 2 summarizes the results for each scenario across 10 trials with max_idle_cycles set to 100 cycles and idle_sleep set to 1 ms. This configuration represents the worst case (cf. Figure 3) as Haystack sleeps for only 1 ms before polling the interfaces again. Unfortunately, due to hardware limitations, we could not measure Haystack’s power consumption with the screen off but, for that scenario, we can use Haystack’s CPU overhead as a proxy [61]. During idle periods with the screen active, Haystack increases power consumption by 3% (similar to the CPU increase). The overhead of Haystack increases to 9% while streaming a YouTube video.
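The percentages reported in Table 2 follow directly from the mean power readings. The short check below recomputes them; the helper name `relative_increase_pct` is ours, not part of Haystack.

```python
def relative_increase_pct(baseline_mw: float, with_haystack_mw: float) -> float:
    """Percentage increase in mean power draw relative to the baseline."""
    return (with_haystack_mw - baseline_mw) / baseline_mw * 100.0


# Mean power readings (mW) taken from Table 2.
idle_increase = relative_increase_pct(1089.6, 1123.8)
youtube_increase = relative_increase_pct(1755.3, 1914.4)

print(f"Idle: +{idle_increase:.1f}%  YouTube: +{youtube_increase:.1f}%")
```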
Haystack suspends polling during periods of inactivity to conserve battery. However, suspending polling also increases latency for packets that arrive during loop suspension, as illustrated in Figure 4. In the figure, the packet that arrives during the first idle_sleep period endures the remainder of the idle period (t_buff), in addition to the forwarding time (t_proc), which includes looking up relevant header state and translating between the layer 3 tun interface and layer 4 NIO sockets. However, the packet that arrives when polling is active does not experience the idle period overhead.

(Footnote: We faced several instrumentation challenges that impeded measuring Haystack’s power consumption on a Nexus 5.)

[Figure 5 panels: (a) UDP latency, UDP RTT (ms) vs. idle_sleep (ms) per max_idle_cycles; (b) TCP connection time (ms) vs. idle_sleep (ms) per max_idle_cycles; (c) TCP throughput (Mbps) vs. idle_sleep.]

Figure 5: Haystack performance (UDP latency, TCP connection time, and TCP throughput) for different idle_sleep and max_idle_cycles configurations. For the throughput evaluation, we fix max_idle_cycles to 100 cycles, also showing the impact of enabling the TA. The maximum TCP throughput for this link is 73 and 83 Mbps uplink and downlink, respectively.

We now analyze the latency incurred by packets when running Haystack. Specifically, we focus on the impact of max_idle_cycles and idle_sleep and the trade-off between latency and CPU overhead. Figure 5(a) shows the results of our experiments for UDP. When max_idle_cycles = 1 cycle the latency closely follows idle_sleep because Haystack’s aggressive sleeping renders it more likely for packets to arrive when the system is idle, therefore delaying them for up to idle_sleep ms before being processed. With max_idle_cycles = 100 cycles and idle_sleep = 100 ms we find about 60 ms of extra delay. Reducing idle_sleep to 10 ms while keeping max_idle_cycles at 100 cycles reduces latency to as low as 3.4 ms. We find similar patterns for TCP connections. In Figure 5(b) we plot connection establishment times for the TCP echo client and server. As expected, high values of max_idle_cycles coupled with low idle_sleep settings result in quicker connection establishment.
In fact, when the RTT of the link drops below the time it takes to reach max_idle_cycles cycles, Haystack processes all packets in the TCP handshake without the Forwarder going into idle state.

Finally, we consider the latency incurred by a packet during processing and forwarding (t_proc). To get a sense of how the latter affects performance, we evaluate t_proc while running our speed-test app. Table 3 shows the results of our speed-test for TCP and UDP connections. Processing times for established flows are 141 µs for TCP and 76 µs for UDP, indicating that packet forwarding is not a bottleneck for Haystack’s performance. The processing times for new connections prove larger, especially for TCP, because of the overhead of initiating state for the connection.

                   Downlink    Uplink    New Flow
TCP t_proc (µs)    141.6 ±     ±         ±
UDP t_proc (µs)    76.6 ±      ±         ±

Table 3:
Mean processing time (t_proc) and standard error of mean (SEM) for Haystack’s forwarding operations for TCP and UDP flows under stress conditions. The first packet of a new flow requires a higher processing time.

We now investigate the maximum throughput the system can achieve. We use our speed-test app to measure the throughput for non-TLS TCP and UDP flows with idle_sleep = { ms, ms } and max_idle_cycles = 100 cycles. This setting provides us with a good compromise between CPU usage and latency.

Figure 5(c) shows the maximum throughput achieved by Haystack’s Forwarder. We find that Haystack can provide up to 17.2 and 54.9 Mbps uplink and downlink throughput, respectively. As expected, when idle_sleep increases the throughput decreases, as more packets arrive with the Forwarder in idle state, thus incurring t_buff (cf. Figure 4). Haystack also has a bias towards downstream traffic, which stems from two factors. First, as we discuss in § tun interface reads only a single packet at a time. Second, the operations required for upstream packets (t_proc) are more computationally expensive (see Table 3). We plan to investigate how we can adapt the max_read_nio and max_read_tun parameters to achieve more balanced throughput in future work. We note that the performance we report is still in excess of what is required for modern mobile apps.

Although the TA operates off-path, the use of thread-safe queues to enable communication between the Forwarder and the TA, together with the TA’s CPU-intensive operations, can inflict significant overhead on traffic throughput. As an example analysis task we consider string matching using the Aho-Corasick algorithm [9] on the traffic to detect tracking. Figure 5(c) shows the TA’s impact on throughput when performing CPU-intensive string matching on each flow. In the worst case, Haystack provides 23.3 Mbps downstream and 10.5 Mbps upstream throughput. Even when stress-testing Haystack with our speed-test, the maximum queuing time endured by packets before the string parsing engine processes them does not exceed 650 ms. This worst-case scenario arises when the queues contain a backlog of at least 1,000 packets. Even under such circumstances, the total processing time remains low enough to provide feedback to users about their traffic in less than a second (e.g., exfiltrated private information).

[Figure 6 panels: (a) TLS session establishment time (ms) vs. idle_sleep (10, 100, n/a); (b) TLS download speeds, TLS downlink throughput (Mbps).]

Figure 6: Session establishment and throughput for TLS with Haystack for different idle_sleep configurations, also showing the impact of enabling both the TLS proxy and the TA. We fix max_idle_cycles to 100 cycles.

There remains significant potential for improving the overhead imposed by the TA. In particular, we plan to investigate better means of communicating between the Forwarder and the TA (e.g., via Android’s IPC [12]) to make it more efficient than the thread-safe queue we currently employ.
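To make the string-matching workload concrete, the following is a compact, textbook Aho-Corasick matcher. It is a sketch under stated assumptions rather than the TA’s code: it operates on Python strings instead of raw packet bytes, and the identifier values in the usage example are made up.

```python
from collections import deque


def build_automaton(patterns):
    """Build the goto, failure, and output tables for a set of patterns."""
    goto, fail, output = [{}], [0], [[]]
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                output.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].append(pattern)
    # Breadth-first traversal: failure links of shallower states come first.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit matches that end at the failure state.
            output[nxt] = output[nxt] + output[fail[nxt]]
    return goto, fail, output


def find_matches(text, automaton):
    """Return (start_index, pattern) for every occurrence, in one pass over text."""
    goto, fail, output = automaton
    state, matches = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pattern in output[state]:
            matches.append((i - len(pattern) + 1, pattern))
    return matches


if __name__ == "__main__":
    # Hypothetical device identifiers to look for in a decrypted payload.
    identifiers = ["490154203237518", "+15551230000"]
    payload = "GET /track?imei=490154203237518&os=android"
    print(find_matches(payload, build_automaton(identifiers)))
```

The appeal of this algorithm for traffic analysis is that all patterns are matched in a single pass over each payload, so the per-byte cost does not grow with the number of identifiers being searched for.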
We next turn to the overhead of dealing with encrypted communication. Figure 6 summarizes the overhead of the TLS proxy for different configurations. We first consider the baseline overhead of Haystack without the TLS proxy enabled on TLS connection establishment times, as shown in Figure 6(a). With an idle_sleep of 10 ms the TLS connection establishment time is 218 ms. Increasing idle_sleep to 100 ms has a large effect, doubling the TLS establishment time (466 ms). Using the TLS proxy further increases establishment time to 653 ms with idle_sleep at 10 ms, and 503 ms with idle_sleep at 100 ms.

We next assess the overhead of the TLS proxy on throughput, as shown in Figure 6(b). Compared to not running Haystack at all, the overhead of the TLS proxy is 26% and 29% for idle_sleep = 10 ms and 100 ms, respectively. Despite the decrease in throughput, overall throughput with the TLS proxy is still 26 Mbps, which (as discussed in § idle_sleep has little impact on throughput since subsequent packets bring the Forwarder out of the idle state, thus avoiding t_buff for the bulk of the transfer. The fact that the TLS proxy reassembles the streams for the idle_sleep also helps reduce the overhead.

Haystack’s Forwarder parameters can affect Haystack’s ability to accurately measure network performance. This section compares Haystack’s ability to assess traffic and network performance with tcpdump packet-level timestamps on a rooted phone. For these experiments, we instrument a rooted mobile device with an Android app that performs 500 UDP-based DNS queries to 8.8.8.8 for [nonce].stonybrook.edu. The nonce ensures that all queries bypass any intermediate cache. We perform the DNS lookups in two different settings: (i) when the DNS traffic goes directly through the default gateway, and (ii) when Haystack forwards the DNS traffic.
This allows us to calibrate Haystack by comparing actual performance as seen by user-space apps with passive measurements as seen by Haystack.

We use idle_sleep = 0 ms and max_idle_cycles = 200 cycles so that we can minimize packet wait time and prevent blocking on a given interface, at the expense of increasing the CPU load. We send the queries sequentially and with random inter-query delays of 250 ms + rand(0, ms), over a stable, well-provisioned WiFi link. The random delay ensures that packets are not queued and that we are sampling the times in different polling states of Haystack (recall Figure 4). We factor out transient effects in the network by computing the difference between measurements made by Haystack, those made by the Android app, and those obtained via tcpdump. Figure 7 shows the difference between our user-level measurements and the reference tcpdump measurements. Table 4 summarizes these differences. The difference between Haystack’s observation of DNS latency and the Android app is small, with mean and median values differing by less than 50 µs. We find similar results over a cellular link, which we expect because the measured overheads stem from the Java virtual machine and Haystack, not from varying network conditions. The magnitude of the differences we observe remains orders of magnitude smaller than typical mobile network delays, making Haystack suitable for fine-grained network performance measurements.

[Figure 7 plot: CDF of the time difference (microseconds) for App − tcpdump and Haystack − tcpdump.]

Figure 7: Difference between DNS lookup times as measured by a Java-based application (red line) and by Haystack (blue line), both compared to tcpdump. The crossing of the red and blue lines is likely due to instabilities in measuring from the application that are introduced by the Java VM on Android. The analysis confirms Haystack as a valid user-space performance measurement platform.

                      Mean     Median    StDev
Haystack − tcpdump    1, µs    1, µs     303 µs
App − tcpdump         1, µs    1, µs     658 µs

Table 4: Detailed statistics of the distribution shown in Figure 7.
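The calibration procedure above (cache-busting query names, randomized spacing, and summary statistics over the timing differences) can be sketched offline as follows. This is an illustrative outline only: the jitter bound `jitter_ms` is an assumption because the upper limit of the random delay is not given here, and the function names are hypothetical. No DNS traffic is sent.

```python
import random
import statistics
import uuid


def nonce_hostname(zone: str = "stonybrook.edu") -> str:
    """Build a unique query name so that no intermediate cache can answer it."""
    return f"{uuid.uuid4().hex}.{zone}"


def inter_query_delay_ms(rng: random.Random, base_ms: float = 250.0,
                         jitter_ms: float = 250.0) -> float:
    """Delay before the next query: base plus uniform jitter (jitter_ms assumed)."""
    return base_ms + rng.uniform(0.0, jitter_ms)


def summarize(differences_us):
    """Mean, median, and sample standard deviation of timing differences (µs)."""
    return {
        "mean": statistics.mean(differences_us),
        "median": statistics.median(differences_us),
        "stdev": statistics.stdev(differences_us),
    }


if __name__ == "__main__":
    rng = random.Random(7)
    for _ in range(3):
        print(nonce_hostname(), round(inter_query_delay_ms(rng), 1))
```

In a real run, each difference would be the latency observed by the app (or by Haystack) minus the latency derived from tcpdump timestamps for the same query.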
Above we demonstrate the tradeoffs between resource usage and performance of Haystack as controlled by varying the max_idle_cycles and idle_sleep parameters. While we find no sweet spot that is ideal in all circumstances, our observations point to three possible ways to adapt Haystack’s operation to the current device state.
Adapting parameters:
Our first technique for reducing Haystack’s overhead is adapting the parameters to different usage scenarios to strike a balance for the current device state. We consider two scenarios: (i) when the user interacts with the device or in the presence of latency-sensitive background traffic (e.g., streaming audio), and (ii) when the phone is idle with minimal network activity (e.g., push notifications), a critical scenario as phones remain idle the majority of time [17, 61]. In the first case, minimizing latency and guaranteeing the best user experience is critical (“performance” mode), whereas in the second case, traffic is delay tolerant (“low-power” mode) [17]. Based on our results we determine that idle_sleep = 10 ms and max_idle_cycles = 100 cycles gives the best tradeoff of performance and resource usage for delay-sensitive usage, and idle_sleep = 100 ms and max_idle_cycles = 100 cycles gives the best tradeoff during delay-insensitive usage. Table 5 summarizes the overheads and performance for each of these settings.

Sampling:
A second way to reduce the overhead of Haystack is to sample a fraction of the connections or application sessions to fully analyze. This has the usual costs and benefits of sampling: lower resource usage (e.g., by not requiring the TA to do as much work) on the one hand and less coverage on the other hand. We believe that users likely care most about the apps and networks they use regularly; therefore, while sampling may take longer to uncover issues, these issues will in fact be uncovered due to high usage.

Users    Apps     Total Flows    Domains
450      1,340    942,836        8,710

Table 6: Summary and scale of our user study.

Targeting:
A final technique to reduce the overhead of Haystack is to allow users to instruct Haystack to only consider certain apps. For instance, this may be useful when the user installs a new app and wants to understand what information it may leak.
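The three techniques (adapting parameters, sampling, and targeting) could be combined along the following lines. This is a hedged sketch, not Haystack’s code: the two parameter sets come from the text, while `select_mode`, `should_analyze`, the use of screen state as a proxy for interactivity, and hash-based sampling are our own illustrative assumptions.

```python
import zlib
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class ForwarderConfig:
    idle_sleep_ms: int
    max_idle_cycles: int


# Parameter sets reported for the two operational modes.
PERFORMANCE = ForwarderConfig(idle_sleep_ms=10, max_idle_cycles=100)
LOW_POWER = ForwarderConfig(idle_sleep_ms=100, max_idle_cycles=100)


def select_mode(screen_on: bool, latency_sensitive_traffic: bool) -> ForwarderConfig:
    """Pick "performance" mode when latency matters, "low-power" otherwise."""
    if screen_on or latency_sensitive_traffic:
        return PERFORMANCE
    return LOW_POWER


def should_analyze(app: str, flow_id: str,
                   targeted_apps: Optional[Set[str]] = None,
                   sample_rate: float = 1.0) -> bool:
    """Decide whether a flow is handed to the analyzer.

    Targeting: if targeted_apps is given, only those apps are considered.
    Sampling: a stable hash of the flow id keeps roughly sample_rate of flows.
    """
    if targeted_apps is not None and app not in targeted_apps:
        return False
    bucket = zlib.crc32(flow_id.encode("utf-8")) % 10_000
    return bucket < sample_rate * 10_000


if __name__ == "__main__":
    print(select_mode(screen_on=False, latency_sensitive_traffic=False))
    print(should_analyze("com.example.newapp", "flow-42",
                         targeted_apps={"com.example.newapp"}, sample_rate=0.25))
```

Hashing the flow identifier, rather than drawing a random number per packet, keeps the decision consistent for every packet of the same flow.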
6. ADVANTAGES OF HAYSTACK
As we illustrate in §2, the research community has focused much attention on understanding mobile devices and networks. These previous efforts have produced many significant insights. We now turn to discussing Haystack’s advantages, for both researchers and users, relative to the state of the art in mobile measurement and app profiling. We describe Haystack capabilities, as well as early results culled from data collected by the 450 users who have installed Haystack to date. We summarize our initial dataset in Table 6. We stress that these are not full-fledged measurement studies and are presented for illustrative purposes. Finally, we note that while we discuss Haystack’s capabilities in isolation, they can often be used together to even greater effect.
Unprecedented View of Encrypted Traffic:
Haystack’s TLS proxy allows analysis, with the user’s permission, of encrypted communication. This information remains opaque to other methodologies (e.g., ISP network traces) or requires trusting a third-party middleman to decrypt and correctly re-encrypt traffic while protecting the clear-text version (e.g., remote VPN endpoints). This visibility is significant as we find that in our dataset 22% of the flows are encrypted and less than 20% of apps send all their traffic in the clear. Therefore, gaining an understanding of the information flow within the mobile ecosystem critically depends on being able to cope with encrypted traffic.
Unprecedented View of Local Traffic:
Haystack can naturally observe local network traffic that never traverses the wide-area network. This traffic does not appear in ISP network traces [27, 62] or in methodologies that rely on remote VPN tunnels [52]. This capability will only increase in importance given the emergence of Internet-of-Things (IoT) devices in the household, typically using mobile devices for control. Our dataset includes 40 apps that generate local traffic, ranging from baby monitors to media servers to smart TV remote control apps.

Mode         idle_sleep (ms)  max_idle_cycles  Mean UDP RTT (ms)  Mean TCP Conn. Time (ms)  Mean TLS Conn. Time (ms)  Max. Throughput [Up/Down] (Mbps)  Mean CPU (%)
Performance  10               100              5.4                24.8                      313.1                     17.2/54.9                         11.2
Low-power    100              100              60.8               65.3                      505.3                     16.7/48.2                         2.7

Table 5: Summary of Haystack’s performance for each operational mode in a 5 GHz WiFi link with 3 ms RTT.
Representative View of Apps:
When trying to understand app behavior, a natural first question concerns finding a set of apps to study. Haystack answers this question quite simply by considering the natural set of apps each user executes. For other methodologies, such as static and dynamic analysis, the set of apps considered often results from a crawl of the Google Play store. However, this neglects built-in apps and apps from non-standard or alternative app stores [23], as well as behavioral changes caused by new app updates. Further, such studies frequently exclude non-free apps or code paths that stem from an in-app purchase [23, 24]. Haystack includes all of these aspects naturally.

Our initial data suggests analyzing these apps is important. We find 15% of the apps we observe did not come from the Google Play store and include (i) apps developed by large and small device vendors and mobile carriers, (ii) pre-installed third-party apps (e.g., Kineto, a WiFi calling app [4]), and (iii) apps downloaded from alternate or regional app stores [50]. These apps create 22% of the traffic we observe. Further, we find that 3.7% of the apps in Google Play are not free and 29% of the apps include in-app purchases. Both of these represent code paths often skipped when studying app behavior, but which Haystack considers as a matter of course. Finally, we find that apps not originating from the Google Play store do in fact leak personal information and unique identifiers (e.g., to third-party tracking services such as Crashlytics).

Representative Code Paths:
Related to the last point is that Haystack naturally deals with common and esoteric code paths. Dynamic analysis requires manual navigation within apps, oftentimes synthetically generated via UI Monkeys [66]. The results in turn prove sensitive to details of this test navigation. Haystack does not suffer from this problem because the interactions observed reflect the natural way that the user interacts with the app. Therefore, even though Haystack will never know that some unused code path is nefarious, it does not matter to that user because that user never invokes the problematic case. Conversely, a dynamic analysis test case might miss some buried feature that gets exercised by real users, in which case Haystack will catch networking activity resulting from that code path.
User Involvement:
Haystack’s position on user devices provides the capability to engage users in novel ways that benefit both research into the mobile ecosystem and the users themselves. For instance, users could be given the option to opt in to assisting researchers in assessing the Quality-of-Experience (QoE) of their normal traffic. Combining qualitative user feedback with quantitative measurements of the traffic could be a powerful combination that brings many insights to this space. Another case where direct interaction helps the user is in understanding the leakage of personal information from the phone. By exposing this information to users, they can make more informed decisions about what apps to use or what permissions to grant to specific apps.
Novel Policy Enforcement:
Haystack’s position between apps and the network provides a unique ability to implement policy before traffic leaves the mobile device. While Haystack can certainly enforce traditional network policies (e.g., blocking access to specific IP addresses or hostnames, or rate-limiting certain traffic), the insight into app content means the policies can be more semantically rich (e.g., blocking based on attempted leaking of a specific piece of information like the IMEI to an untrusted online service). Further, policies can be richer than simple blocking decisions. For instance, specific traffic could be sent through a VPN tunnel or an anonymization network. Or, particular sensitive elements of the contents could be obfuscated (e.g., providing a random IMEI to each app to hinder cross-app tracking).
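A content-aware policy of the kind described above might look like the following sketch. Everything here is hypothetical (the `Policy` fields, the `enforce` function, the host names, and the IMEI value); it only illustrates how hostname blocking and identifier obfuscation could be expressed as one decision step.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class Policy:
    blocked_hosts: Set[str] = field(default_factory=set)
    trusted_hosts: Set[str] = field(default_factory=set)
    # Maps a sensitive value (e.g., the real IMEI) to its replacement.
    obfuscate: Dict[str, str] = field(default_factory=dict)


def enforce(policy: Policy, host: str, payload: str) -> Tuple[str, str]:
    """Return (action, payload), where action is "block", "obfuscate", or "allow"."""
    if host in policy.blocked_hosts:
        return "block", ""
    if host not in policy.trusted_hosts:
        rewritten = payload
        for value, replacement in policy.obfuscate.items():
            rewritten = rewritten.replace(value, replacement)
        if rewritten != payload:
            return "obfuscate", rewritten
    return "allow", payload


if __name__ == "__main__":
    policy = Policy(
        blocked_hosts={"tracker.example.net"},
        trusted_hosts={"api.example-bank.com"},
        obfuscate={"490154203237518": "000000000000000"},  # made-up IMEI
    )
    print(enforce(policy, "ads.example.org", "id=490154203237518&v=2"))
```

A per-app replacement value, as suggested in the text, would only require keying the `obfuscate` mapping by app as well.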
Enabling Reactive Measurement:
Haystack is not beholden to naturally occurring traffic, but can trigger its own active measurements as needed. These measurements, taken from the device’s natural perspective, could be proactive in an attempt to better understand the current network context. Alternatively, active measurements could be taken reactively [10] based on some observation (e.g., to diagnose slow transfers or delay spikes). Finally, active measurements could explore “what if” sorts of analysis that could in turn be used to tune the device’s operation for better performance. For instance, Haystack could explore whether an alternate DNS resolver would yield faster or better answers compared with the standard resolver.
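One way to picture a reactive trigger is a small detector that watches passively observed round-trip times and signals when an active measurement should be launched. The class below is purely illustrative: `SpikeTrigger`, its window size, and the "three times the recent median" rule are assumptions of ours, not something the platform is stated to implement.

```python
import statistics
from collections import deque


class SpikeTrigger:
    """Flags an RTT sample that is far above the recent median."""

    def __init__(self, window: int = 20, factor: float = 3.0, min_samples: int = 5):
        self.history = deque(maxlen=window)
        self.factor = factor
        self.min_samples = min_samples

    def observe(self, rtt_ms: float) -> bool:
        """Record a sample; return True if an active measurement should start."""
        spike = (
            len(self.history) >= self.min_samples
            and rtt_ms > self.factor * statistics.median(self.history)
        )
        self.history.append(rtt_ms)
        return spike


if __name__ == "__main__":
    trigger = SpikeTrigger()
    for rtt in [20, 22, 19, 21, 20, 23, 95]:
        if trigger.observe(rtt):
            print(f"delay spike at {rtt} ms: schedule an active probe")
```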
Explicitness Leading To Precision:
Many of the problems in measuring the mobile ecosystem stem from the need to infer or estimate various aspects of the communication. Haystack gets around many of these problems because it operates within explicit context from the mobile operating system. For instance, instead of inferring a device’s phone number, Haystack has a direct understanding of the value and therefore can directly hunt for it in the traffic instead of searching for something that “looks like” a phone number and then trying to determine if that is in fact the device’s phone number or not.

As another example, Haystack is able to directly attribute traffic flows to apps rather than using heuristics or inferences. Within our dataset we use this ability to detect apps that use third-party services (for myriad reasons, including ad delivery, analytics, and alternative push notifications [3]) at the network level. While a simple app-agnostic count of accesses to specific services provides some understanding of popular services, this method leaves ambiguous whether the popularity stems from broad use across apps or simply use by popular apps. Haystack can directly answer this question. Figure 8 shows the distribution of the number of third-party services used per app across our dataset. By ranking the online services by the number of apps connecting to them, we can see that Crashlytics and Flurry are the most common third-party services across our corpus of mobile apps.

[Figure 8 plot: CCDF of the number of third-party services per app, by service type (Ad network, All, Analytic service, Other).]

Figure 8: Distribution of the number of third-party services per app across our dataset.
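The difference between counting accesses and counting distinct apps is easy to see in a few lines of code. The sketch below uses made-up flow records of the form (app, service domain); the function name and the sample data are illustrative only.

```python
from collections import Counter, defaultdict


def rank_services_by_apps(flows):
    """Rank services by the number of distinct apps that contact them."""
    apps_per_service = defaultdict(set)
    for app, service in flows:
        apps_per_service[service].add(app)
    ranking = [(service, len(apps)) for service, apps in apps_per_service.items()]
    # Most widely used first; ties broken alphabetically for stable output.
    return sorted(ranking, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical per-flow records attributed to apps.
    flows = [
        ("app.video", "cdn.example.com"),
        ("app.video", "cdn.example.com"),
        ("app.video", "cdn.example.com"),
        ("app.game", "analytics.example.com"),
        ("app.news", "analytics.example.com"),
    ]
    print(Counter(service for _, service in flows).most_common())  # by flow count
    print(rank_services_by_apps(flows))                            # by distinct apps
```

In this toy data the CDN leads by flow count only because one popular app uses it heavily, whereas the analytics service is used by more apps; per-app attribution is what separates the two cases.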
Large-Scale Deployment:
Haystack’s operation as a normal user-level app that does not require rooting a device or a custom firmware version means the barrier to entry for using Haystack is low. This is useful for research purposes as we can attract more users to the platform than a more cumbersome tool would. Further, the platform gives us a direct path to moving beyond research and to actually helping normal users understand the operation of their devices, which simply would not be possible if users had to jump through significant hoops to install and use Haystack. These two aspects feed a virtuous circle: more users provide more data that we can leverage to increase users’ understanding, which in turn provides a larger incentive to entice additional users.

While each of the above is an advantage in its own right, they become even more powerful when combined. For instance, Haystack has the power to understand that a specific app is trying to leak the user’s phone number to some IoT device in their house within an encrypted connection. Further, this understanding could be communicated to the user, as well as serving as fodder for a policy that thwarts such activity in the future. Each aspect of this example is either impossible with current techniques or at least requires inference and heuristics.
7. DISCUSSION AND FUTURE WORK

In § e.g., preventing Taintdroid-like approaches), Haystack’s approach provides a promising avenue for investigating the iOS ecosystem.

We stress that the applications of Haystack sketched in this paper serve to exemplify its abilities as a traffic inspection platform, not as a step in the network security arms race. For example, we do not advocate Haystack as a full-blown TLS inspector; rather, we demonstrate that the platform supports development in this direction to a point that can readily provide interesting results.

We are currently exploring ways to open up Haystack to the research and app developer communities. By further separating the Forwarder and Traffic Analyzer components we can establish access to the device’s traffic streams for other apps, effectively providing a proxy to the absent “packet capture” permission on Android. (Footnote: Personal communication with the Google Android team suggests that this option remains unlikely to ever materialize directly in the Android OS.) In doing so, we can overcome an additional constraint of Android’s security model, namely that only a single VPN app can run at any given time.

We acknowledge that opening up Haystack’s capabilities to third-party apps would raise grave security concerns, as malicious apps may abuse Haystack’s capabilities for nefarious purposes. We defer full treatment to future work and here only mention that Android’s custom permissions model [58] provides avenues for making access to these new capabilities controllable by the user.

We are planning to open-source the Haystack codebase, and will make anonymized data collected by Haystack available online via a web-based query interface.
8. SUMMARY
We have presented the design, implementation, and evaluation of Haystack, a multi-purpose mobile vantage point for Android devices built on top of Android’s VPN permission. As Haystack runs completely in user space, it enables large-scale measurements of real-world mobile network traffic from end-user devices, with organic user and network input.

Through extensive evaluation, we have demonstrated that Haystack realizes a flexible mobile measurement platform that can deliver sufficient performance with modest resource overhead and minimal impact on user activity when compared to state-of-the-art methods that rely on static and dynamic analysis.

Haystack opens a new horizon in mobile research by achieving an architectural sweet spot that makes it easy to install on regular user phones (thus enabling large-scale deployment and benefiting from users’ input) while enabling in-depth visibility into device activity and traffic (thus providing installation incentives to the user). Using a deployment to 450 users who installed the Haystack app from Google’s Play Store, we demonstrated Haystack’s ability to provide meaningful insights about protocol usage, its ability to identify security and privacy concerns of mobile apps, and to characterize mobile traffic performance.
9. REFERENCES [1] Censys. https://censys.io/ .[2] jpcap. A network packet capture library for Java. http://jpcap.sourceforge.net/ .[3] JPush. .[4] Kineto. http://kineto.com .[5] Maxmind. Geo-IP Java API. https://github.com/maxmind/geoip-api-java .[6] MPAndroidChart. https://github.com/PhilJay/MPAndroidChart .[7] Networked system ethics. .[8] The Bro Network Security Monitor. .[9]
Aho, A., and Corasick, M.
Efficient stringmatching: an aid to bibliographic search.
Communications of the ACM (1975).[10]
Allman, M., and Paxson, V.
A reactivemeasurement framework. In
PAM (2008).[11]
Anand, S., Naik, M., Harrold, M. J., and Yang,H.
Automated concolic testing of smartphone apps. In
ACM SIGSOFT (2012).[12]
Android Developer’s Documentation . BinderIPC. http://developer.android.com/reference/android/os/Binder.html .[13]
Android Developer’s Documentation . Securitytips. using interprocess communications. http://developer.android.com/training/articles/security-tips.html .[14]
Android Developer’s Documentation . VPNService. http://developer.android.com/reference/android/net/VpnService.html .[15]
Apple Developer’s Documentation . What’s newin the network extensions and VPN. https://developer.apple.com/videos/play/wwdc2015/717/ .[16]
Au, K. W. Y., Zhou, Y. F., Huang, Z., and Lie,D.
PScout: Analyzing Android permissionspecification. In
ACM CCS (2012).[17]
Aucinas, A., Vallina-Rodriguez, N.,Grunenberger, Y., Erramilli, V., Papagiannaki,K., Crowcroft, J., and Wetherall, D.
Stayingonline while mobile: The hidden costs. In
ACMCoNEXT (2013).[18]
Bermudez, I. N., Mellia, M., Munaf`o, M. M.,Keralapura, R., and Nucci, A.
DNS to the rescue:discerning content and services in a tangled web. In
ACM IMC (2012).[19]
Blake-Wilson, S., Nystrom, M., Hopwood, D.,Mikkelsen, J., and Wright, T.
Transport layersecurity (tls) extensions. Tech. rep., 2006.[20]
Boneh, D., Inguva, S., and Baker, I.
SSL Man inthe Middle Proxy. https://crypto.stanford.edu/ssl-mitm/ .[21]
Chen, T., Ullah, I., Kaafar, M. A., and Boreli,R.
Information leakage through mobile analyticsservices. In
ACM HotMobile (2014).[22]
Egele, M., Kruegel, C., Kirda, E., and Vigna,G.
PiOS: Detecting privacy leaks in iOS applications.In
NDSS (2011).[23]
Enck, W., Gilbert, P., Chun, B., Cox, L., Jung,J., McDaniel, P., and Sheth, A.
TaintDroid: AnInformation-Flow Tracking System for RealtimePrivacy Monitoring on Smartphones. In
USENIXOSDI (2010).[24]
Fahl, S., Harbach, M., Muders, T., Smith, M.,Baumg¨artner, L., and Freisleben, B.
Why Eve nd Mallory love Android: An analysis of AndroidSSL (in) security. In ACM CCS (2012).[25]
Falaki, H., Lymberopoulos, D., Mahajan, R.,Kandula, S., and Estrin, D.
A first look at trafficon smartphones. In
ACM IMC (2010).[26]
Georgiev, M., Iyengar, S., Jana, S., Anubhai,R., Boneh, D., and Shmatikov, V.
The mostdangerous code in the world: Validating ssl certificatesin non-browser software. In
ACM CCS (2012).[27]
Gill, P., Erramilli, V., Chaintreau, A.,Krishnamurthy, B., Papagiannaki, K., andRodriguez, P.
Follow the money: Understandingeconomics of online aggregation and advertising. In
ACM IMC (2013).[28]
Google Play . FCC SpeedTest. https://play.google.com/store/apps/details?id=com.samknows.fcc .[29]
Google Play . MobiPerf. https://play.google.com/store/apps/details?id=com.mobiperf .[30]
Google Play . My Speed Test. https://play.google.com/store/apps/details?id=com.num .[31]
Google Play . NameHelp. https://play.google.com/store/apps/details?id=edu.northwestern.aqualab.namehelp .[32]
Google Play . Netalyzr. https://play.google.com/store/apps/details?id=edu.berkeley.icsi.netalyzr.android .[33]
Google Play . Noroot firewall. https://play.google.com/store/apps/details?id=app.greyshirts.firewall&hl=en .[34]
Google Play . Ookla SpeedTest. https://play.google.com/store/apps/details?id=org.zwanoo.android.speedtest .[35]
Google Play . OpenSignal Maps. https://play.google.com/store/apps/details?id=com.staircase3.opensignal .[36]
Google Play . Packet Capture. https://play.google.com/store/apps/details?id=app.greyshirts.sslcapture .[37]
Google Play . tPacketCapture. https://play.google.com/store/apps/details?id=jp.co.taosoftware.android.packetcapture .[38]
Hornyack, P., Han, S., Jung, J., Schechter, S.,and Wetherall, D.
These aren’t the droids you’relooking for: Retrofitting android to protect data fromimperious applications. In
ACM CCS (2011).[39]
Huang, J., Qian, F., Mao, Z. M., Sen, S., andSpatscheck, O.
Screen-off traffic characterizationand optimization in 3G/4G networks. In
ACM IMC (2012).[40]
ICSI - Google Play . Haystack app. https://play.google.com/apps/testing/edu.berkeley.icsi.haystack .[41]
J. Sommers and Paul Barford . Cell vs. WiFi: onthe performance of metro area mobile connections. In
ACM IMC (2012).[42]
Kakhki, A. M., Razaghpanah, A., Li, A., Koo, H., Golani, R., Choffnes, D., Gill, P., and Mislove, A.
Identifying Traffic Differentiation in Mobile Networks. In
ACM IMC (2015).[43]
Kreibich, C., Weaver, N., Nechaev, B., and Paxson, V.
Netalyzr: Illuminating the edge network. In
Proceedings of the ACM Internet Measurement Conference (IMC) (Melbourne, Australia, November 2010), pp. 246–259.[44]
Le, A., Varmarken, J., Langhoff, S., Shuba, A., Gjoka, M., and Markopoulou, A.
AntMonitor: A System for Monitoring from Mobile Devices. In
ACM C2B1D (2015).[45]
Leontiadis, I., Efstratiou, C., Picone, M., and Mascolo, C.
Don’t kill my ads!: Balancing privacy in an ad-supported mobile application market. In
ACM HotMobile (2012).[46]
Monsoon Power Monitor. .[47]
Nikravesh, A., Yao, H., Xu, S., Choffnes, D., and Mao, Z. M.
Mobilyzer: An open platform for controllable mobile network measurements. In
ACM MobiSys (2015).[48]
Oracle. New I/O APIs. http://docs.oracle.com/javase/1.5.0/docs/guide/nio/index.html .[49]
Perta, V. C., Barbera, M. V., Tyson, G., Haddadi, H., and Mei, A.
A glance through the VPN looking glass: IPv6 leakage and DNS hijacking in commercial VPN clients.
PETS (2015).[50]
Petsas, T., Papadogiannakis, A., Polychronakis, M., Markatos, E., and Karagiannis, T.
Rise of the planet of the apps: A systematic study of the mobile app ecosystem. In
Proceedings of ACM IMC (2013).[51]
Qian, F., Wang, Z., Gerber, A., Mao, Z., Sen, S., and Spatscheck, O.
Profiling resource usage for mobile applications: A cross-layer approach. In
ACM MobiSys (2011).[52]
Rao, A., Sherry, J., Legout, A., Krishnamurthy, A., Dabbous, W., and Choffnes, D.
Meddle: Middleboxes for Increased Transparency and Control of Mobile Traffic. In
ACM CoNEXT Student Workshop (2012).[53]
Ren, J., Rao, A., Lindorfer, M., Legout, A., and Choffnes, D.
ReCon: Revealing and Controlling PII Leaks in Mobile Network Traffic. In
ACM MobiSys (2016).[54]
Samsung. KNOX VPN SDK. https://seap.samsung.com/sdk/knox-vpn-android .[55]
Seneviratne, S., Kolamunna, H., and Seneviratne, A.
A measurement study of tracking in paid mobile applications. In
ACM WiSec (2015).[56]
Seneviratne, S., Seneviratne, A., Kaafar, M., Mahanti, A., and Mohapatra, P.
Early detection of spam mobile apps. In
WWW (2015).[57]
Shafiq, Z., Ji, L., Liu, A., Pang, J., Venkataraman, S., and Wang, J.
A first look at cellular network performance during crowded events. In
ACM SIGMETRICS (2013).[58]
Six, J.
An in-depth introduction to the Android permission model. .[59]
Song, Y., and Hengartner, U.
PrivacyGuard: A VPN-based platform to detect information leakage on Android devices. In
Proceedings of the 5th Annual ACM CCS Workshop on Security and Privacy in Smartphones and Mobile Devices (2015).[60]
Vallina-Rodriguez, N., Aucinas, A., Almeida, M., Grunenberger, Y., Papagiannaki, K., and Crowcroft, J.
RILAnalyzer: A comprehensive 3G monitor on your phone. In
ACM IMC (2013).[61]
Vallina-Rodriguez, N., and Crowcroft, J.
Energy management techniques in modern mobile handsets. IEEE Communications Surveys & Tutorials (2012).[62]
Vallina-Rodriguez, N., Shah, J., Finamore, A., Grunenberger, Y., Papagiannaki, K., Haddadi, H., and Crowcroft, J.
Breaking for commercials: Characterizing mobile advertising. In
ACM IMC (2012).[63]
Vallina-Rodriguez, N., Sundaresan, S., Kreibich, C., Weaver, N., and Paxson, V.
Beyond the radio: Illuminating the higher layers of mobile networks. In
ACM MobiSys (2015).[64]
Vigneri, L., Chandrashekar, J., Pefkianakis, I., and Heen, O.
Taming the Android AppStore: Lightweight Characterization of Android Applications.
ArXiv e-prints (2015).[65]
Wijesekera, P., Baokar, A., Hosseini, A., Egelman, S., Wagner, D., and Beznosov, K.
Android permissions remystified: A field study on contextual integrity. In
USENIX Security (2015).[66]
Wong, M. Y., and Lie, D.
IntelliDroid: A targeted input generator for the dynamic analysis of Android malware.[67]
Yang, Z., Yang, M., Zhang, Y., Gu, G., Ning, P., and Wang, X.
AppIntent: Analyzing sensitive data transmission in Android for privacy leakage detection. In
ACM CCS (2013).[68]
Zhang, Y., Yang, M., Xu, B., Yang, Z., Gu, G., Ning, P., Wang, X., and Zang, B.
Vetting Undesirable Behaviors in Android Apps with Permission Use Analysis. In
ACM CCS (2013).