Defending Tor from Network Adversaries: A Case Study of Network Path Prediction

Abstract

The Tor anonymity network has been shown vulnerable to traffic analysis attacks by autonomous systems and Internet exchanges, which can observe different overlay hops belonging to the same circuit. We aim to determine whether network path prediction techniques provide an accurate picture of the threat from such adversaries, and whether they can be used to avoid this threat. We perform a measurement study by running traceroutes from Tor relays to destinations around the Internet. We use the data to evaluate the accuracy of the autonomous systems and Internet exchanges that are predicted to appear on the path using state-of-the-art path inference techniques; we also consider the impact that prediction errors have on Tor security, and whether it is possible to produce a useful overestimate that does not miss important threats. Finally, we evaluate the possibility of using these predictions to actively avoid AS and IX adversaries and the challenges this creates for the design of Tor.


Joshua Juen, University of Illinois at Urbana-Champaign
Aaron Johnson, Naval Research Laboratory
Anupam Das, University of Illinois at Urbana-Champaign
Borisov, University of Illinois at Urbana-Champaign
Matthew Caesar, University of Illinois at Urbana-Champaign


Keywords

Autonomous Systems; Internet Exchanges; Tor

1. INTRODUCTION

The Tor network for anonymous communication [10] is susceptible to end-to-end timing attacks [25], which allow an adversary who observes traffic from a client to the first Tor router and at the same time traffic from the last Tor router to the destination to deanonymize the connection. Both of these paths traverse a number of Internet routers that belong to various organizations, leaving the possibility that a single network operator running an autonomous system (AS) or an Internet exchange (IX) will be in a position to observe both paths and thus carry out the end-to-end timing attack [11, 13, 17, 18, 23]. This is made more likely by the concentration of Internet traffic at Tier 1 ISPs and high-volume IXes.

To assess the vulnerability of Tor to AS and IX adversaries, it is necessary to predict the paths that traffic takes on the Internet. Previous work characterizing this threat has relied chiefly on AS-level routing predictions [24]. Such predictions are well known to be incomplete and imprecise, producing erroneous path predictions. Our goal is to evaluate the impact of these errors on the anonymity of Tor. In particular, we are concerned with two research questions:

• Are AS-level routing predictions suitable for characterizing the threat of AS and IX adversaries in Tor?
• Can AS-level routing predictions be used to construct Tor paths that avoid AS and IX adversaries (as has been suggested in previous work [4, 11])?

To answer these questions, we performed a measurement study, collecting traceroute probes from Tor relays to obtain a more accurate picture of Internet paths actually used by network traffic. In comparing results from traceroutes to state-of-the-art path prediction, we found that the prediction accuracy was notably worse than previously measured, despite the fact that we are interested in a simplified prediction problem looking for the set of ASes (or IXes) on a path, rather than the exact sequence. The errors include both extraneous ASes and IXes in the prediction that are not seen in traceroutes and, more worryingly, ASes and IXes in the traceroutes that are missing from the prediction. It is possible to produce an overestimate of the AS and IX sets by considering several of the most likely paths produced by the prediction algorithm, rather than just the top one. Such overestimates reduce but do not eliminate the problem of missing ASes and IXes, at the cost of significantly increasing the number of extraneous predictions.

We next analyze the impact of these prediction errors on the vulnerability of Tor to AS- and IX-level adversaries, with the help of a simulator (TorPS: http://torps.github.io/) that faithfully reconstructs Tor paths that may have been chosen by a Tor user. We find that AS and IX path prediction significantly overestimates the threat of vulnerability to such adversaries; at the same time, most users do run a significant risk of compromise by an AS-level adversary as determined from the traceroute data, whereas IX-level adversaries affect only a small fraction of paths.

We then modify our simulator to specifically avoid selecting paths that are vulnerable to AS or IX adversaries based on predictions, as has been previously suggested. We show that this significantly limits the choice of paths and frequently results in no paths being available for use while following the Tor practice of maintaining a long-term fixed set of entry guards into the network. This suggests a new consideration for the already complex set of tradeoffs in the design of the mechanisms for selecting and updating the set of entry guards used in Tor [12]; we note that the situation is made worse by the recent move towards using a single entry guard instead of 3 [9]. On the other hand, we find that many of these failures are a consequence of over-prediction, as we are often able to find suitable non-vulnerable paths in our traceroute data set despite covering only a fraction of the Tor relays. This suggests that a defense based on proactive path measurement, rather than AS path models, is likely to be more practical and offer better security guarantees.

2. BACKGROUND

2.1 Tor

Tor is a popular system for anonymous communication online. Tor consists of a network of volunteer relays that form an overlay network and forward traffic sent by users running Tor clients. As of February 2015, it contains approximately 7 000 relays and transfers around 70 Gbps of data for a user population estimated at over 2 000 000 (https://metrics.torproject.org/).

Tor uses onion routing [10] to achieve anonymity. A client sets up a connection to a destination by choosing a sequence of three relays, conventionally called guard, middle, and exit, and establishing a circuit through the sequence. The client encrypts a message once for each circuit relay (a process called onion encryption), then sends it through the circuit, and each relay removes one layer of encryption before forwarding. The final relay sends unencrypted messages to the destination. The reverse process happens for messages from the destination to the client. As a result of this process, the client identity is only directly observable in traffic between the client and the guard relay, and the destination identity is only directly observable in traffic between the exit relay and the destination.

In order to be real-time and efficient, Tor does not mix, pad, or delay traffic. Therefore it is vulnerable to attacks based on traffic analysis. For example, an adversary that can observe a circuit between the client and guard and also between the exit and destination can correlate the traffic patterns and deanonymize the connection [7]. Thus entities that can observe parts of the underlying network infrastructure, such as Internet Service Providers or Internet Exchanges, are a serious threat to Tor. Previous work has shown that individual Autonomous Systems and Internet Exchanges are in fact frequently in a position to break Tor's security [11, 13, 17, 18, 23]. However, almost all of this analysis uses heuristic route-inference techniques whose accuracy may not be satisfactory. Murdoch and Zieliński [23] do study Tor security against IXes using traceroutes from Tor relays, but that analysis is from the UK only and does not consider whether IX adversaries can be avoided during path selection.

Internet routing at the highest level is performed among autonomous systems using the Border Gateway Protocol (BGP). An AS is a network with an opaque internal routing policy (e.g., using OSPF, IS-IS, RIP, or iBGP) that routes traffic to and from other networks. BGP is a path-vector routing protocol, that is, neighboring networks advertise the whole AS path that they will use to send traffic to a given destination. A path is advertised for an IP prefix and represents the path used for all IP addresses sharing that prefix. Path-vector routing enables each AS to make complex routing decisions based on factors such as individual contracts with other ASes.

Understanding the behavior of such complex routing policies on the Internet is a challenging problem. Routers just propagate the routes that they provide for a given neighbor to use, and so different Internet vantage points reveal different subsets of global routing behavior. Sources of routing data include the Route Views Project, which provides BGP routing information from many large ASes, and CAIDA Archipelago, which provides and analyzes traceroute data from three teams of 17–18 monitors distributed worldwide. Gao describes how to use such data to infer Internet routes [14]. Gao's method uses heuristics to classify the observed connections between ASes by their economic relationship (viz. customer-to-provider, provider-to-customer, peer-to-peer, or sibling), and then shortest-path valley-free routing is used to infer the route between two hosts. Qiu and Gao improve the accuracy of this technique by incorporating the observed advertised BGP paths [24]. In addition, they describe how to infer a set of possible paths rather than just one. Their results show that these techniques can infer the exact correct AS path for 60% of evaluation ASes; furthermore, the exact path is found within the top 5 predicted possible paths for 83% of ASes and within the top 14 paths for 86% of ASes.

Many links between ASes occur at Internet Exchanges. These are facilities that provide space and infrastructure for ASes to locate routers and establish connections. Ager et al. [3] describe how the largest IXes may provide links among hundreds of ASes and carry petabytes of traffic per day. Augustin et al. [6] describe how IXes on Internet routes can be detected using traceroutes and an index of known IXes and their IP prefixes. They identify 44 000 peering relationships between ASes at IXes. Each peering between two ASes indicates that some traceroute passed directly from one AS to another through an IX. Discovering such links can improve the accuracy of AS path inference techniques. However, as we will observe, it does not discern among different router-level paths taken between the same two ASes, which may pass through different IXes.

The traceroute tool is extraordinarily useful in measuring routing behavior on the Internet. There are many variations of the basic algorithm [21], which provide different levels of success depending on the traffic engineering (e.g., filtering and load balancing) that occurs en route. In addition to such problems with traceroute itself, it is not always straightforward to make inferences about Internet paths from a traceroute. For example, Mao et al. [22] describe the difficulties of inferring an AS-level path from traceroutes, which include that different iterations of a single traceroute might take different paths, that reported IP addresses may be from a network interface other than the one that actually received the probe, and that mapping from IP address to AS number is non-trivial due to inaccurate WHOIS information. Augustin et al. [6] discuss similar issues in inferring the presence of IXes from traceroutes.
Nevertheless, traceroutes do provide a generally accurate picture of how packets are actually routed and serve as an important comparison point to AS-level path predictions.

3. MAPPING NETWORK ADVERSARIES

3.1 Measuring Internet Paths

Our measurement study consists of running traceroutes from Tor relays to various destinations in the Internet. We use the scamper network tool, which probes multiple destinations in parallel and uses techniques to accurately discover the Internet path traversed by packets in the presence of multi-path load balancing [5, 19]. For our measurements, we extracted the set of advertised destination IP prefixes from the September 2013 RIB dumps from Route Views. Each relay running the measurements picks a random IP address within each of the ∼

We next process the traceroutes to determine which ASes and ISPs an Internet path has traversed. First, we filter out traceroutes that do not successfully reach the destination. Note that because we use randomized destinations, in many cases the destination may not exist or may be down; indeed, only a small fraction (8%) of probes reach their target. However, 49% reach the AS of the destination, as determined by the MaxMind GeoIP database [1].

We further find that 94% of the traceroutes are missing some hops from the path. In some cases, we believe this is caused by routers close to the probe source rate limiting their ICMP responses. To address this, we perform route stitching, where gaps in a traceroute are filled by path segments observed in other traceroutes. For example, if we see a path "A B C D E" and another path "A B * D F," where "*" denotes a missing hop, we can repair the second path by inferring that the third hop must have also been C in this case.
To minimize inaccuracies introduced by this repair mechanism, we only consider path segments that originate from the same host, and which are contained within the same batch of 64K traceroutes, which typically occur within an hour or two of each other. We validated this approach on complete paths and found that stitching would have given us the correct AS path result 96% of the time.

We then compute the ASes corresponding to each IP in the path using the GeoIP database. Similar to Mao et al. [22], we consider the corresponding AS path complete if the traceroute reached the AS of the destination and there are no missing hops in the path on the boundary between ASes. For example, an AS path "AS1 AS1 * AS1 AS2 AS3" is considered complete, because the missing hop is contained entirely within AS1, whereas "AS1 AS1 * AS2 AS3" is considered incomplete. Overall, 28% of the traceroutes yield a complete AS path. We discard the other traceroutes from our analysis. We also identify an IX as on the path if the path contains an IP address from the list of known IP addresses of IX points as outlined in the following section.

                      Min           Max           Total
Traces                392 334       1 195 577     17 233 153
IP Reached            27 254        81 999        1 183 427
AS Reached            204 836       618 224       8 890 142
Repaired              26 375        876 386       9 643 679
Whole IP Path         0             16 694        253 058
Whole AS Path         11 285        397 533       5 350 713
Probed Hops           7 005 020     20 624 809    297 301 238
Responsive Hops       1 594 449     7 041 819     109 046 956
% Hops Responded      19            55            37
Inferred Paths        10 693        220 449       4 367 097

Table 1: Traceroute experiment statistics (28 hosts)
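The route stitching step described in this section can be illustrated with a minimal Python sketch. It is not the implementation used in the study: the function names are hypothetical, hops are plain strings with None standing for "*", and we assume a gap is filled only when it is a single hop and exactly one candidate was observed between the same two neighbors. The caller is assumed to pass only traces from the same host and the same batch.

```python
from typing import Dict, List, Optional, Set, Tuple

Hop = Optional[str]  # None marks a missing hop ("*")


def build_segment_index(traces: List[List[Hop]]) -> Dict[Tuple[str, str], Set[str]]:
    """Map (previous hop, next hop) to the hops observed between them."""
    index: Dict[Tuple[str, str], Set[str]] = {}
    for trace in traces:
        for prev, mid, nxt in zip(trace, trace[1:], trace[2:]):
            if prev is not None and mid is not None and nxt is not None:
                index.setdefault((prev, nxt), set()).add(mid)
    return index


def stitch(trace: List[Hop], index: Dict[Tuple[str, str], Set[str]]) -> List[Hop]:
    """Fill single-hop gaps that have exactly one observed candidate."""
    repaired = list(trace)
    for i in range(1, len(repaired) - 1):
        if repaired[i] is not None:
            continue
        prev, nxt = repaired[i - 1], repaired[i + 1]
        if prev is None or nxt is None:
            continue  # longer gaps are left unrepaired in this sketch
        candidates = index.get((prev, nxt), set())
        if len(candidates) == 1:  # assumption: skip ambiguous repairs
            repaired[i] = next(iter(candidates))
    return repaired


if __name__ == "__main__":
    batch = [["A", "B", "C", "D", "E"], ["A", "B", None, "D", "F"]]
    idx = build_segment_index(batch)
    print(stitch(batch[1], idx))
```

Refusing to repair ambiguous or multi-hop gaps is a conservative choice made for the sketch; it trades repaired-path coverage for fewer wrongly inferred hops.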

We are interested in comparing the AS and IX adversaries identified from traceroute data to the AS and IX adversaries inferred from AS maps, which are much easier to obtain and maintain. We predict AS paths from source to destination using Gao's algorithm [14] to classify relationships and Qiu and Gao's algorithm [24] to infer the top k paths (for k = 1 to 5). While advances have been made in classifying AS link relationships [20], we find that matching route information broadcasts is more accurate than using graph-based methods based solely on AS relationships [18]. It is known that AS relationships are difficult to classify, especially at the highly interconnected core of the AS graph. Violations of the valley-free principle in advertised routes often indicate erroneous AS relationship classification, especially through top-tier ASes. Therefore Qiu and Gao's method of prepending advertised routes to complete paths yields accurate results even with incorrectly classified AS relationships at the core of the Internet. Since the prepended hops are almost entirely easily classified customer-to-provider hops at the bottom of the AS graph, improving the AS relationship classification of the top-level ASes does little to improve overall AS path prediction accuracy.

To predict the presence of IXes, we recreate the work of Augustin et al. [6]. We scraped Packet Clearing House and the Peering Database in February of 2014, creating a list of 732 Internet exchange points and their known prefixes. We parsed over 200 million traceroutes from February and March 2014, collected from both the CAIDA routed IPv4 database [2] and the iPlane project, to map roughly 130 000 Internet exchange point peerings for our subsequent test data. This number is roughly twice the number of links found by Augustin et al. in 2009, which is unsurprising considering the trend for ASes to peer at IX points. This list of source AS, destination AS, and IX number is used to identify potential IX points on AS-level paths throughout our experiment. The list of IXes and prefixes is used to positively identify IX points in the IP traces.
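The two uses of the IX data described above can be sketched as follows. This is an illustrative example under stated assumptions, not the code used in the study: the IX names, prefixes (taken from documentation address ranges), AS numbers, and both function names are hypothetical, and we assume the peering list is keyed by an ordered (source AS, destination AS) pair.

```python
import ipaddress
from typing import Dict, Iterable, List, Sequence, Set, Tuple

# Hypothetical IX prefix list (documentation prefixes, not real IX data).
IX_PREFIXES: Dict[str, List[str]] = {
    "IX-A": ["192.0.2.0/24"],
    "IX-B": ["198.51.100.0/24", "2001:db8::/32"],
}

# Hypothetical peering list: (source AS, destination AS) -> IXes at which
# some traceroute was seen crossing directly between the two ASes.
IX_PEERINGS: Dict[Tuple[int, int], Set[str]] = {
    (64496, 64497): {"IX-A", "IX-B"},
}


def ixes_in_trace(hops: Iterable[str], ix_prefixes: Dict[str, List[str]]) -> Set[str]:
    """Positively identify IXes whose prefixes contain a traceroute hop."""
    networks = [
        (ipaddress.ip_network(prefix), name)
        for name, prefixes in ix_prefixes.items()
        for prefix in prefixes
    ]
    found: Set[str] = set()
    for hop in hops:
        addr = ipaddress.ip_address(hop)
        for network, name in networks:
            if addr in network:  # False when IP versions differ
                found.add(name)
    return found


def potential_ixes(as_path: Sequence[int], peerings: Dict[Tuple[int, int], Set[str]]) -> Set[str]:
    """Collect every IX that could carry some AS-AS hop of an AS path."""
    found: Set[str] = set()
    for src, dst in zip(as_path, as_path[1:]):
        found |= peerings.get((src, dst), set())
    return found


if __name__ == "__main__":
    print(ixes_in_trace(["203.0.113.7", "192.0.2.10"], IX_PREFIXES))
    print(potential_ixes([64496, 64497, 64498], IX_PEERINGS))
```

In this toy input the IP-level trace crosses one IX, while the AS-level lookup reports two candidates for the same AS–AS hop, which is the kind of overestimate the AS-level method can produce.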

4. MEASUREMENT VERSUS INFERENCE

We first investigate how closely ASes and IXes identified through inference correspond to ASes and IXes measured with traceroute. We conduct analysis on 17 million traceroute measurements obtained from 28 Tor relay servers from January 19th to January 26th, 2014, as summarized in Table 1. Our 28 servers included many of the largest Tor relays and cover a portion of the Tor network which includes 23% of guard node capacity and 26% of exit node capacity. Thus, their measurements can give us good insight into how traffic is routed in and out of Tor. Of these 17 million traces, only about 1 million reached the target IP address, with roughly 250 000 complete paths with no missing IP hops. We find that roughly 9 million paths can have some hops filled in by using the repair techniques presented in Section 3. We map each IP address to AS numbers using the MaxMind GeoLite ASN database taken from January 15th, 2014 [1]. The AS-level paths are then parsed to remove routing loops, duplicate hops, and missing hops directly preceded by and followed by the same AS. After processing, we obtain 5.3 million complete AS-level routes.

iPlane: http://iplane.cs.washington.edu/

We extend sincere thanks to the volunteer Tor relay operators who assisted us with this study.

Figure 1: Avg Missing ASes and IXes Per Host (x-axis: source hosts, ASes left and IXes right; y-axis: missing ASes/IXes; series: k = 1 to 5)

Figure 2: Extra ASes and IXes Per Host (x-axis: source hosts, ASes left and IXes right; y-axis: extra ASes/IXes; series: k = 1 to 5)

Figure 3: CDF of Missing ASes over all Traces (x-axis: ASes missing; y-axis: percent; series: k = 1 to 5)

Figure 4: CDF of Missing IXes over all Traces (x-axis: IXes missing; y-axis: percent; series: k = 1 to 5)

Figure 5: Observed Internet Exchange Points (x-axis: source hosts; y-axis: Internet exchange points; series: traceroute, predicted)
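The AS-path post-processing described in this section (dropping duplicate hops and gaps bounded by the same AS) together with the completeness rule from Section 3.1 can be sketched as below. This is a simplified illustration with hypothetical function names, where hops are AS numbers and None marks a hop with no response; the handling of routing loops is left out because the text does not specify how loops are removed.

```python
from typing import List, Optional


def collapse_as_path(hops: List[Optional[int]]) -> Optional[List[int]]:
    """Collapse per-hop AS numbers into an AS path.

    Returns None when a run of missing hops sits on an AS boundary
    (or at either end of the trace), i.e. the AS path is incomplete.
    """
    path: List[int] = []
    i, n = 0, len(hops)
    while i < n:
        if hops[i] is None:
            j = i
            while j < n and hops[j] is None:
                j += 1  # find the end of the gap
            before = hops[i - 1] if i > 0 else None
            after = hops[j] if j < n else None
            if before is None or after is None or before != after:
                return None  # gap not contained within a single AS
            i = j
            continue
        if not path or path[-1] != hops[i]:
            path.append(hops[i])  # skip consecutive duplicates
        i += 1
    return path


def is_complete(hops: List[Optional[int]], destination_as: int) -> bool:
    """A path is complete if it has no boundary gaps and ends in the destination AS."""
    path = collapse_as_path(hops)
    return bool(path) and path[-1] == destination_as


if __name__ == "__main__":
    print(collapse_as_path([1, 1, None, 1, 2, 3]))  # "AS1 AS1 * AS1 AS2 AS3"
    print(collapse_as_path([1, 1, None, 2, 3]))     # "AS1 AS1 * AS2 AS3"
```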

We first investigate the accuracy of inferring ASes between an arbitrary source/destination AS pair compared to the ASes identified in our collected traceroutes. The analysis of path prediction accuracy is conducted on traceroutes collected during January 19–26, 2014, giving 5.3 million traces that contain 450 000 unique AS source and destination pairs. We divide the traceroute data into 24-hour windows. Routing table dumps are downloaded from each server of the Route Views project from the time closest to the 12th hour of the window. Each day window contains an average of 15 prefix table dumps with between four and six gigabytes of route information broadcasts. Using Qiu and Gao's model, we predict the top k = 5 paths for roughly 400 000 of these pairs, with the rest failing due to either the source or destination AS missing from the Route Views routing tables. The 400 000 successful path inferences cover 4.5 million of our 5.3 million traces with AS paths. We consider the inference as identifying the correct path if it matches the AS path seen in the traceroute for any of the top k paths considered.

Figure 3 is a CDF showing the number of ASes seen in the traceroute but missed by the top k predicted paths. Zero missing ASes corresponds to a correct path prediction for at least one of the k paths. Using only the top path from the prediction yields roughly 20% prediction accuracy, with diminishing returns for higher values of k and a maximum accuracy of 48% when considering the top 5 paths. This accuracy is far lower than the 83% accuracy for the top 5 paths attained by Qiu and Gao; however, their validation was conducted using Route Views data as the ground truth and not traceroute data [14]. We surmise the lower accuracy is due to a combination of known error sources, both from the increase in prevalence of IX peering [3] and inherent errors in traceroute measurements from actual AS-level paths [27]. We finally note that the overall accuracy of prediction is similar to that found in our extensive analysis of accuracy against CAIDA traceroutes, demonstrating that the Tor network would benefit from improvements to path prediction in the general case [18].

Given an AS path, we can identify the set of potential IXes that could occur on this path by considering which IXes can be used for each AS–AS hop, as discussed in Section 3.2. Figure 4 compares the set of potential IXes for the top k predicted AS paths to the set of IXes identified by IP address prefix in traceroutes. Once again, a value of zero indicates that all IXes have been identified in the inference. We find that the top path identifies roughly 40% of the IXes and the top five paths identify roughly 74% of the IXes.

We expect traceroutes to provide a much more accurate picture of which IXes were involved on a path than AS path predictions. A pair of ASes will often have multiple peering points, depending on the geographic location of the source and destination; as a result, only a fraction of traffic between the two ASes will use a given IX. In Figure 5, we compare the set of IXes on a traceroute to the predicted set of IXes that could be used at each AS–AS hop in the traceroute. We see that while most traceroutes do not traverse any IX, the AS–AS hops result in 3–8 potential IXes on average. This demonstrates the limitations of using only AS-level information to infer IXes on a path.

k Top Paths

Choosing the k top paths to consider when predicting AS and IX points presents an important tradeoff between missing ASes/IXes and severely overstating the number of ASes/IXes on a given path. Figures 1 and 2 show the missing ASes/IXes and extra ASes/IXes seen by the path prediction algorithms for the k top paths from k = 1 to 5. For ASes, we see diminishing returns in missing ASes for larger values of k, with false positives increasing quickly for larger values of k. Overall, the average number of missing ASes is close to 1 for the top path, decreasing to 0.6 for the top 5 paths.

We compare the IXes found using vulnerable AS–AS hops from the inferred AS paths directly to the IXes identified by prefix in the traceroutes. In general, very few IX points were seen in the traceroutes. For most hosts, IX identification is helped very little by increasing the number of top paths, with an average of about 0.2 missing IXes per hop. This is unsurprising because if there are no IX points in the traceroutes, then there can be no missing IXes in the inference. Unfortunately, the false positives for IX points are problematic, with linearly increasing averages ranging from 10 to 25 for each of our hosts, illustrating the need for better methods of identifying IX points.

For AS adversaries, a k value of 1 or 2 seems most appropriate to identify most AS adversaries without causing too many false positives. Higher values of k give lower rates of return while causing a linear increase in false positives. Identifying IX adversaries is much more problematic. Since the traceroutes identify very few IX adversaries to begin with, a k value of 1 appears to work well. The inaccuracy of the method can be seen in the false positives, which also increase linearly with k but greatly over-predict the number of adversaries even with a k value of 1. The inaccuracy of AS prediction and the high inaccuracy of IX prediction could potentially cause serious problems when designing a system of AS/IX independence in Tor. We analyze the effects of this inaccuracy in the following sections.
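The tradeoff between missing and extra ASes as k grows can be made concrete with a short sketch. We assume here that the prediction for a given k is the union of the ASes on the top k predicted paths, in line with the overestimate described in the introduction; the function name and the AS numbers are hypothetical and the paths are toy data, not measurements.

```python
from typing import Sequence, Set, Tuple


def missing_and_extra(
    measured: Sequence[int],
    predicted_paths: Sequence[Sequence[int]],
    k: int,
) -> Tuple[Set[int], Set[int]]:
    """Compare a measured AS path with the union of the top-k predicted paths.

    Returns (missing, extra): ASes seen in the traceroute but not predicted,
    and ASes predicted but not seen in the traceroute.
    """
    observed = set(measured)
    predicted: Set[int] = set()
    for path in predicted_paths[:k]:  # paths are assumed ranked best-first
        predicted.update(path)
    return observed - predicted, predicted - observed


if __name__ == "__main__":
    measured = [64496, 64497, 64499]
    ranked = [
        [64496, 64498, 64499],
        [64496, 64497, 64500, 64499],
    ]
    for k in (1, 2):
        missing, extra = missing_and_extra(measured, ranked, k)
        print(k, sorted(missing), sorted(extra))
```

In this toy case, moving from k = 1 to k = 2 removes the one missing AS but doubles the number of extra ASes, mirroring the diminishing returns and growing false positives discussed above.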

5. AS AND IX ADVERSARIES IN TOR

Errors in path prediction call into question previous work that has used path prediction both to evaluate the security of Tor and to propose changes to Tor's path selection based on path predictions. Understanding the effect of the errors uncovered by our traceroute measurements requires taking into account the specific properties of Tor.

We accomplish such an analysis by simulating the Tor protocol and network at a high level. We use and adapt the Tor Path Simulator (TorPS, https://github.com/torps) to perform Monte Carlo simulation of Tor path selection by a single client. By using the hourly network "consensuses" and server "descriptors" archived by CollecTor (https://collector.torproject.org/), we can recreate the state of the Tor network over the period we run our simulations, including features such as the number, bandwidths, and addresses of Tor relays available in any given hour. We simulate "typical" user activity using the recorded volunteer trace of Johnson et al. [17], which includes user behaviors such as web search and webmail on a plausible daily schedule. Over the course of a week, this schedule results in 2632 streams (i.e., TCP connections over Tor), each to one of 205 distinct IP addresses occupying 168 unique ASes, on either port 80 or 443. Finally, we run simulations using the most common client ASes as measured by Juen in Fall 2011 [18].

Simulating path selection in Tor allows us to estimate which Internet hosts a user's traffic is likely to flow over in a typical use case. Then we can use our traceroute data to determine the specific Internet routes that traffic would take and evaluate the resulting security. Specifically, we provide new estimates for how often a Tor stream flows through the same AS or IX between the client and the guard and between the destination and the exit. When this happens, the AS or IX is in a position to deanonymize the client. This issue was previously studied only using inferred AS paths and IX sets.

In addition, using this method we provide an improved evaluation of the repeatedly proposed [11, 13] modification to Tor to use AS/IX path inference to choose relays that are path independent, that is, that result in paths for which the same AS or IX cannot observe both the client and the destination. We modify TorPS to produce the first simulator for path-independent Tor (to our knowledge) that reproduces how path selection occurs over time, including features that have the potential to significantly alter the effectiveness of the path-independence requirement, such as guard lists and circuit reuse. We apply our traceroute measurements to the results of these simulations to evaluate the effectiveness of path inference as a basis for path independence in Tor.

All of our Tor simulations run over the week of January 19–25, 2014. When producing and analyzing these simulations, we generally use the same data sources and inference algorithms as in Sec. 3.2 to produce AS path inferences, AS-level IX inferences, and traceroute IX inferences. We use daily AS-path inferences conducted from January 19th–25th, 2014, compared to the traces from each day of the simulation week. We also use the daily Route Views prefix-to-AS datasets to determine routed prefixes and to map IPs to ASes. When analyzing our simulations using traceroutes, we use all of the traceroute measurements gathered during the week of January 19th–25th, 2014. In our analysis we match a traceroute to a pair of communicating hosts in Tor if the source prefix and destination prefix match.

We first conduct a simulation using the default Tor path selection algorithm. We consider clients coming from 50 of the top 200 most common client ASes (as measured by Juen [18]). Each AS advertises hundreds of possible prefixes in the Route Views data. We select at random twenty prefixes per client AS for a total of 1 000 client prefixes for the simulations.
The simulator runs 10 000 repetitions of simulated traffic using input data from the week of January 19th–25th, 2014, yielding over 24 million traffic streams per client prefix with 18.2 million unique streams. We identify the presence of AS and IX adversaries using AS-path inference with the top k paths (k = 1 to 5) and our collected traceroute data from January 19th–January 25th. In total, we have inferred path information for an average of 18 million streams per client prefix (18 billion total) and traceroute information for an average of 112 000 streams per client prefix (14 million total).

Figure 6: Directional AS Compromises for K Top Paths (y-axis: % total streams; series: Forward, Reverse, From Tor, Forward/Reverse)

Figure 7: Directional IX Compromises for K Top Paths (y-axis: % total streams; series: Forward, Reverse, From Tor, Forward/Reverse)

Figure 8: AS Compromises Measured and Inferred (axes: % of streams, percent; series: TR, Inf K1 to Inf K5)

Figure 9: IX Compromises Measured and Inferred (axes: % of streams, percent; series: TR, Inf K1 to Inf K5)

We first look at the percentage of simulated Tor paths that have the same AS or IX on both the client-to-guard path and the exit-to-destination path, using only the inferred paths. We look at the percentage of compromised paths considering the set of ASes and IXes in the forward direction (client to destination), in the reverse direction, and in the forward and reverse directions combined. We also consider the direction of streams leaving Tor, i.e., from the guard to the client and from the exit to the destination. This matches the direction of our traceroute measurements from Tor relays to external IP prefixes and allows us to compare the predicted paths with traceroute data without errors being introduced by asymmetric Internet paths that traverse a different set of ASes and IXes. We call this the Tor path.

Figures 6 and 7 show the percentage of inferred ASes and IXes for each direction and top k paths, averaged over all 18 billion inferred streams. Considering only the top path, we see 11.6%, 11.6%, 12.1% and 21.6% AS compromise rates for the forward, reverse, Tor, and forward/reverse paths respectively. We see a significant increase in AS adversaries when considering more paths, topping out at 58.8%, 60.6%, 62.0%, and 71.8% when considering the top 5 paths for the forward, reverse, Tor and forward/reverse paths respectively. We notice little difference between the compromise rates of the Tor paths and those of the forward or reverse paths. As expected, the forward and reverse combined shows a higher inferred compromise rate since we consider two sets of ASes per path. The forward/reverse has roughly a 10% greater rate of compromise for the top path and roughly a 20% greater compromise rate when k is varied from 2 to 5.

For the top path, the IX compromise rates were higher, with 27.0%, 17.5%, 29.3% and 43.5% for the forward, reverse, Tor and forward/reverse paths respectively. These increased rapidly at first, leveling off at 72.3%, 72.5%, 74.7% and 77.2% for the top 5 paths. Once again, the forward/reverse paths contain more potential IX adversaries due to considering more paths. There is little significant difference between the compromise rates of the forward paths and the Tor paths. We also note that the number of inferred potential adversaries greatly increases when considering a higher number of top k paths.
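The compromise test described above (a common AS on both the entry side and the exit side, taken over the top k inferred paths) can be illustrated with a minimal sketch. The data layout, the function names, and the toy AS numbers are assumptions made for illustration; they are not the simulator used in this study, and the same logic would apply to IX identifiers.

```python
from typing import List, Sequence, Set, Tuple

# Hypothetical representation: an inferred path is a list of AS numbers,
# and each side of a stream carries its inferred paths ranked most likely first.
Path = List[int]
Stream = Tuple[Sequence[Path], Sequence[Path]]  # (entry-side paths, exit-side paths)


def top_k_ases(ranked_paths: Sequence[Path], k: int) -> Set[int]:
    """Union of the ASes on the k highest-ranked inferred paths."""
    ases: Set[int] = set()
    for path in ranked_paths[:k]:
        ases.update(path)
    return ases


def is_compromised(stream: Stream, k: int) -> bool:
    """True if some AS appears on both the entry side and the exit side."""
    entry_paths, exit_paths = stream
    return bool(top_k_ases(entry_paths, k) & top_k_ases(exit_paths, k))


def compromise_rate(streams: Sequence[Stream], k: int) -> float:
    """Fraction of streams with a common AS on both sides."""
    if not streams:
        return 0.0
    return sum(is_compromised(s, k) for s in streams) / len(streams)


# Toy data invented for illustration, not taken from the measurements.
streams: List[Stream] = [
    ([[100, 200, 300], [100, 400, 300]], [[500, 600], [500, 400, 600]]),
    ([[1, 2]], [[2, 3]]),
]
for k in (1, 2):
    print(k, compromise_rate(streams, k))
```

In the toy data the first stream is flagged only at k = 2, because AS 400 appears solely on the second-ranked paths. Taking the union over more paths can only add candidate adversaries, which is consistent with the inferred rates rising with k.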

We now compare the inferred AS and IX adversaries to the AS and IX adversaries actually present in the traceroute measurements for all of our simulated Tor circuits. To make the comparison fair, we only consider the traceroutes and inferred paths going from the Tor guard to the client and from the Tor exit to the destination. As seen in the last section, the inferred paths using this Tor direction contain similar compromise rates to the paths in the forward and reverse directions. We thus consider the subset of paths for which we have both AS inferences and measured traceroutes in the Tor direction, giving us a set of 141 million streams from 1000 unique client prefixes.

Figure 10:

AS Compromise Agreement

Figure 11:

AS Compromise False Positives

Figure 12:

AS Compromise False Negatives

Figure 13:

IX Compromise Agreement

Figure 14:

IX Compromise False Positives

Figure 15:

IX Compromise False Negatives

[Figures 10 through 15 are plots with axes "% of Streams" and "Percent" and one curve each for K=1 through K=5.]

AS/IX Compromise CDFs Per Prefix with both Traceroute and Inferred Data

Figure 8 shows the CDF of streams compromised in the traceroute measurements compared to the inferred paths for various top k paths. Interestingly, the AS compromise rate for the top path is similar to the actual compromise rate seen in the measurements. Considering the top 2 paths more than doubles the inferred compromise rate, with smaller increases as k grows, topping out at a little under a 50% compromise rate for half the paths. Figure 9 shows the CDF of streams compromised with measured versus inferred IX adversaries. The actual percentage of paths with an IX adversary identified by prefix is much smaller than the inferred value, with only .8% of streams seeing an IX adversary on both the client to guard and exit to destination paths simultaneously. The inferred paths greatly exaggerate the threat, with the top path giving an average compromise rate of 40% and the top 5 paths giving an average of nearly 60%. Thus, the method of inferring IX adversaries greatly overpredicts the number of actual IXes seen when measuring paths using traceroute.

We now consider the differences between adversaries seen using the inference methods versus the adversaries seen in the traceroutes. We consider adversaries seen in the inferred set but not in the measured set as false positives, and adversaries seen in the measurements but not in the inferred set as false negatives. While the traceroute measurement can contain errors and does not constitute perfect ground truth, we consider it more reliable than the inferred methods.
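The three categories just defined can be expressed as set operations on the adversaries each method reports for a stream. The sketch below is one plausible reading, in which a stream counts toward a category whenever at least one adversary falls into it; the function names and the toy identifiers are assumptions for illustration, not the analysis code behind the figures.

```python
from typing import Dict, Iterable, Set, Tuple

# Each stream is a pair (inferred, measured): the sets of adversaries
# (AS or IX identifiers) found on both ends of the stream by each method.
StreamSets = Tuple[Set[str], Set[str]]


def stream_flags(inferred: Set[str], measured: Set[str]) -> Dict[str, bool]:
    """Flag one stream; treating the categories as non-exclusive is an assumption."""
    return {
        "agreement": bool(inferred & measured),       # seen by both methods
        "false_positive": bool(inferred - measured),  # inferred only
        "false_negative": bool(measured - inferred),  # measured only
    }


def category_rates(streams: Iterable[StreamSets]) -> Dict[str, float]:
    """Fraction of all streams that fall into each category."""
    totals = {"agreement": 0, "false_positive": 0, "false_negative": 0}
    count = 0
    for inferred, measured in streams:
        count += 1
        for name, flag in stream_flags(inferred, measured).items():
            totals[name] += int(flag)
    return {name: (n / count if count else 0.0) for name, n in totals.items()}


# Toy data invented for illustration.
example = [
    ({"AS3356"}, {"AS3356"}),  # both methods name the same adversary
    ({"AS6939"}, set()),       # inferred but never measured
    (set(), {"AS1299"}),       # measured but not inferred
    (set(), set()),            # no adversary under either method
]
print(category_rates(example))
```

Because the rates are taken over all streams rather than only over compromised ones, they are directly comparable to an overall measured compromise rate.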
In the following analysis, all percentages are the percentage of paths relative to the set of all 141 million paths for which we have both inferred and measured data.

Figures 10 through 15 show the CDFs of the percentage of streams compromised per prefix, for both ASes and IXes, in three cases: both methods agree; the inference indicates an adversary while the measurement does not (false positive); and the measurement indicates an adversary while the inference does not (false negative). We see that while the percentage of overall AS compromises for the top path was similar in the last section, the two methods do not agree on which AS is causing the compromise. In our measurements, we find that roughly 10.9% of streams could contain a potential AS adversary. Unfortunately, the measured and inferred ASes agree for an average of only 2.6% of streams when considering the top path. Increasing to the top two paths improves this by a factor of 2, to 5.1% average agreement, with higher k values giving diminishing returns after that. However, increasing k from 1 to 2 significantly increases the average false positive rate from 8.5% to 22.6%, with a more linear increase with k up to 41.1% at k = 5. For false negatives, the greatest drop once again occurs when going from the top path to the top two paths, from 8.4% to 5.8%, with diminishing returns with increasing k and a minimum of 4.3% on average at k = 5. Overall, the AS inference with the top 2 paths catches a little less than half of the measured AS adversaries, catching 5.1% and missing 5.8% of the measured 10.9%. Unfortunately, it would still preemptively eliminate 22.6% of paths that had no measured adversary.

The agreement on IX adversaries is even lower. Both methods agree on only .36% of paths having an IX adversary considering the top path, increasing to .44% for the top two paths and up to .47% for the top 5 paths.
The false positive rate is unacceptably high, at 34.5% for the top path and 48.1% for the top 2 paths, up to 55.7% for the top 5 paths. The false negative rate is .54% for the top path, falling to .45% for the top two paths and to .42% for the top five paths. Thus eliminating paths based on the inference with the top path would catch 40.0% of the observed IX adversaries (.9% of total paths) while eliminating 34.5% of paths unnecessarily. Using the top 2 paths would catch 48.9% of observed IX adversaries while eliminating 48.1% of paths unnecessarily. Thus, roughly half of all potential paths would be eliminated to catch the .9% of total paths with an observed IX adversary. This motivates the need for better methods of inferring IX adversaries in order to effectively mitigate the threat to the Tor system.

                                                                        Top 1 Path   Top 3 Paths
Mean fraction of streams that have traceroutes                          0.0037       0.0026
Mean fraction of streams with traceroutes that are w/o independence     0.0043       0.0014
Min prob of at least one stream w/o independence                        0.018        0
Mean prob of at least one stream w/o independence                       0.11         0.053
Max prob of at least one stream w/o independence                        0.22         0.18

(a) Undetected compromise among streams that successfully connected

                                                                        Top 1 Path   Top 3 Paths
Mean fraction of all streams that fail due to independence constraint   0.051        0.060
Mean fraction of streams that have traceroutes                          0.19         0.19
Mean fraction of streams w/ traceroutes that have an independent path   0.96         0.95
Min prob of at least one stream failure                                 1            1

(b) Unnecessary failure among streams without any independent path

Table 2:

Path-independent Tor traceroute analysis over 189 top client ASes

In order to avoid deanonymization by an AS or IX, Tor clients could attempt to choose Tor relays such that the forward and reverse paths between the client and guard are independent of the forward and reverse paths between the exit and destination, in terms of the ASes and IXes that appear. However, it is non-trivial to design a system that allows the client to do so, because he must preserve his anonymity while making this decision, and Tor should be usable even by users with little bandwidth and low-powered devices.

As we discuss in Section 6, Edman and Syverson [11] presented the first detailed proposal for solving this problem, with a system that provides enough data for clients to build an AS Internet map on which to run AS-path inference. They propose a slightly less accurate algorithm than Qiu and Gao's for efficiency. Juen added IX inference to this idea [18]. None of this previous work explains how AS/IX-independent circuits should be created over time, and thus it does not consider how path independence interacts with Tor guards or circuit reuse. Tor guards in particular are a key Tor feature that defends against malicious observation and deanonymization [12, 17]. Thus the prior work does not give a clear idea of how well AS/IX-independent path selection would work even if path-inference techniques were very accurate.

The inaccuracy of path-inference techniques is likely to negatively impact AS/IX-independent path selection in at least two ways: (i) missing an AS or IX on a path could cause the user to create a path vulnerable to deanonymization, and (ii) incorrectly believing that an AS or IX exists on a path could leave the user with few or no ways to connect to the destination. These problems are placed in tension by the inference methodology because false negatives make (i) worse and false positives make (ii) worse. For example, as the number k of top paths used in inference increases, false negatives should go down but false positives should go up.
Moreover, the inference needs to have few false negatives on all paths collectively, or a user will face an increasing risk of deanonymization as he visits new destinations and is forced by network churn to use different relays. Similarly, an increasing number of false positives over time could force the user to choose between not connecting to certain destinations and exposing himself to more and more potentially-malicious guards.

We investigate the suitability of path inference as a basis for AS/IX-independent path selection using path simulation and our traceroute data, similar to how they were used in Section 5.1 to explore vanilla Tor security. As a byproduct of this research, we also expose a security-performance tradeoff inherent in the path-independent approach and reveal some opportunities to fill in and improve past proposals.

In order to evaluate AS/IX-independent path selection via simulation, we must fill in the details of the algorithm sketched out by prior work. We adapt the existing Tor path-selection algorithm for this purpose. We require clients to have at least 3 guards in their guard list and to have at least 2 guards active with AS/IX path information with the client when creating a new circuit. Upon receiving a stream request, existing circuits are examined for suitability, including path independence. If none is suitable, then circuit creation is initiated by choosing an existing guard, then an exit, and then a middle. If a path-independent exit cannot be found for a given guard, the other guards are considered, and so on. If no exit is found for any current guard, then the circuit creation fails. We note that to enable a direct comparison with our traceroute data, our simulator only compares the inferred AS/IX path from the guard to the client and from the exit to the destination when determining path independence (i.e. only reverse entry and forward exit paths are used).

We generally follow the same experimental methodology as that followed in Section 5.1.
We will not be estimating full distributions, and thus we use only 500 samples per client AS, but we run experiments with the 189 of the top 200 client ASes that were in our AS-level routing map.

In our experimental analysis, we are able to use traceroute data to identify false negatives and false positives. 50 client IPs are chosen randomly from the set of the initial IPs in each prefix advertised by the client's AS (according to the RouteViews prefix-to-AS file that appears most recently before that stream occurred). To identify false negatives, we test streams that were successfully assigned to a circuit by looking for a traceroute from the guard's routing prefix to the client's and from the exit's prefix to the destination's. When both traceroutes are found, we look for ASes or IXes that appear in common. To identify false positives, we test streams that failed to connect by looking for a traceroute from any of the active guards at that time to the client and from any potential exit to the destination. If such a pair exists, we look for the lack of any AS or IX in common.
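The circuit-creation fallback described above (try a guard, look for a path-independent exit, move on to the next guard, and fail if none works) can be sketched as follows. This is a simplified illustration under stated assumptions: the relay names and the dictionaries of inferred AS/IX sets are hypothetical, exits are chosen uniformly rather than by Tor's bandwidth weighting, and the reuse of existing circuits and the choice of the middle relay are omitted.

```python
import random
from typing import Dict, List, Optional, Set, Tuple


def choose_guard_and_exit(
    guards: List[str],
    exits: List[str],
    entry_sets: Dict[str, Set[str]],  # guard -> ASes/IXes inferred guard-to-client
    exit_sets: Dict[str, Set[str]],   # exit -> ASes/IXes inferred exit-to-destination
    rng: random.Random,
) -> Optional[Tuple[str, str]]:
    """Return a path-independent (guard, exit) pair, or None if creation fails."""
    for guard in guards:  # guards are tried in list order
        entry = entry_sets[guard]
        independent = [
            e for e in exits
            if e != guard and not (entry & exit_sets[e])  # no shared AS or IX
        ]
        if independent:
            # Uniform choice is a simplification of Tor's weighted selection.
            return guard, rng.choice(independent)
    return None  # no exit is independent of any current guard


# Toy data invented for illustration.
entry_sets = {"g1": {"AS1", "IX9"}, "g2": {"AS2"}}
exit_sets = {"e1": {"AS1"}, "e2": {"IX9", "AS2"}}
rng = random.Random(0)
print(choose_guard_and_exit(["g1", "g2"], ["e1", "e2"], entry_sets, exit_sets, rng))
print(choose_guard_and_exit(["g1", "g2"], ["e2"], entry_sets, exit_sets, rng))
```

In the first call, guard g1 shares an AS or IX with every exit, so selection falls through to g2. In the second call, no guard has an independent exit and the function returns None, which corresponds to a stream failure in the simulation.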

Table 2a provides estimates for the effects of path inference errors on the security of path-independent Tor. The min, mean, and max values are taken over 188 top client ASes (we further excluded one that didn't advertise any prefixes during the simulation week). Our traceroute data provided path information (i.e. matched both guard-client and exit-destination host-prefix pairs in the direction out from Tor) for 0.26–0.38% of the simulations' streams (depending on whether the top 1 or the top 3 inferred paths were used to determine independence). Of these, between 0.14% and 0.43% were revealed to violate path independence. While this may seem acceptably low, even one Tor deanonymization is potentially serious, and over the course of the simulated week, a client had on average between a 5.3% and an 11% probability of experiencing at least one path-independence violation. In the most unlucky client ASes, path independence was violated with a probability as high as 18–21%!

Table 2b shows that this insecurity cannot simply be handled by increasing the number of top possible paths from which the inferred ASes and IXes are taken. It reveals that by increasing the number of top paths used in inference from 1 to 3, the fraction of streams for which no path-independent Tor circuit could be created increased from 5.1% to 6%. For these streams, no AS/IX path-independent exit could be found using any of the client's guards. Note that a stream failure of any kind never occurred in simulation with Tor's default path selection, because Tor doesn't require path independence, and many exits are available for each stream in the user trace. Such failures are particularly bad because the stream will not succeed until the Tor relay population changes sufficiently, a process which could take days or weeks. Thus even a 5.1–6% failure rate has a severely deleterious effect on Tor's suitability for general Internet use.
Moreover, we can see that every simulated client experienced at least one stream failure (i.e. the estimated failure probability is 1.0 for all client ASes).

However, our traceroute measurements offer the hopeful news that most of these stream failures may have been unnecessary. We were able to match a guard-to-client and an exit-to-destination traceroute for 19% of failed streams. Our coverage of failed streams is so much higher than for connected streams because we look for a traceroute from any active guard of the client at the time and from any exit that could be chosen with that guard and for that destination (ignoring only the path-independence constraint). Among streams for which we were able to match at least one pair of traceroutes, 95–96% had a guard and exit that the traceroutes show would have been AS/IX-independent. In fact, this high false-positive rate is not just a result of having many exit paths about which to be incorrect: an average of about 80% of all guard and exit pairs with matched outgoing traceroutes were observed to be path-independent for both experiments.
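The check applied to failed streams can be sketched in the same style: search the guard and exit candidates for any pair whose measured traceroutes share no AS or IX. The three outcome labels, the function name, and the toy data are assumptions for illustration and do not reproduce the analysis scripts used for Table 2b.

```python
from typing import Dict, List, Set


def classify_failed_stream(
    guards: List[str],
    exits: List[str],
    guard_traces: Dict[str, Set[str]],  # guard -> ASes/IXes measured guard-to-client
    exit_traces: Dict[str, Set[str]],   # exit -> ASes/IXes measured exit-to-destination
) -> str:
    """Label a failed stream using measured traceroutes only.

    'unmatched'   : no guard/exit pair has traceroutes on both sides
    'unnecessary' : some matched pair shares no AS or IX
    'confirmed'   : every matched pair shares an AS or IX
    """
    matched = False
    for guard in guards:
        entry = guard_traces.get(guard)
        if entry is None:
            continue  # no traceroute from this guard to the client
        for exit_relay in exits:
            measured_exit = exit_traces.get(exit_relay)
            if measured_exit is None:
                continue  # no traceroute from this exit to the destination
            matched = True
            if not (entry & measured_exit):
                return "unnecessary"  # an independent pair existed
    return "confirmed" if matched else "unmatched"


# Toy data invented for illustration.
guard_traces = {"g1": {"AS1"}}
exit_traces = {"e1": {"AS1"}, "e2": {"AS7"}}
print(classify_failed_stream(["g1", "g2"], ["e1", "e2"], guard_traces, exit_traces))
```

Searching over every candidate pair rather than a single chosen circuit is what makes a traceroute match more likely for failed streams than for connected ones.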

Our evaluation of AS/IX-independent path selection is not intended to make any definitive claims about its usefulness. Instead, we attempt to make reasonable choices about the algorithm details in order to get some idea of how well it might work overall and especially in conjunction with path-inference techniques. Indeed, there are many plausible improvements to the algorithm we have evaluated, such as choosing guards with different network locations to minimize the chance of stream failure, or perhaps allowing streams to use potentially unsafe circuits but limiting the number of potential observing ASes and IXes. Designing network-aware path-selection algorithms for Tor remains an open challenge with unsolved vulnerabilities such as adversarial relay placement [4] and path fingerprinting [8, 16].

Method                    Forward   Reverse   Both
Feamster and Dingledine   17.7%     16.1%     NA
Edman and Syverson        10.9%     11.1%     17.8%
Wacek et al.              NA        NA        27.39%
Juen                      7.1%      7.2%      11.2%
Current Work              11.6%     12.1%     21.6%

Table 3:

Inferred AS Compromise Comparison (Top Path)

6. RELATED WORK

The threat to the Tor network from ASes correlating traffic was first investigated by Feamster and Dingledine [13]. Using a simplified AS model with shortest paths, they determined that roughly 10-30% of circuits could be vulnerable to an AS adversary. Edman and Syverson furthered the understanding of AS adversaries against the Tor network [11]. Using Qiu and Gao's AS path prediction model and an updated model for the Tor network, they determined each circuit had an 11-18% chance that some AS adversary could compromise the circuit. They also presented a technique to choose paths without AS adversaries by using "snapshots" of the AS topology. Akhoondi et al. presented LASTor, an optimization to Tor path selection to minimize latency by considering geographic location [4]. They propose using the set of K top most likely AS paths to eliminate AS adversaries. They do not report overall chances for any given AS to compromise a circuit. Recently, Wacek et al. studied Tor's path selection algorithm [26]. They find that using iPlane's Nano AS map, Tor paths have a 27.39% chance to be vulnerable to an AS adversary.

The danger of IX adversaries was first demonstrated by Murdoch and Zieliński, who showed that an IX could use a Bayesian approach to sample traffic and correlate Tor flows across ASes peering at the IX [23]. Juen further investigated the threat of AS and IX adversaries using Qiu and Gao's AS model and the top K paths, estimating the chance of any AS being able to compromise the circuit to range from 10% to 42% [18]. He reports the chance of an IX compromise to be between 1% and 20%. Johnson et al. investigate the amount of time required for an AS, IX, or IX organization to compromise a circuit using TorPS to simulate realistic Tor traffic [17].
They only consider the top 3 AS and IX adversaries as seen in their inferred data and report the overall chance of an AS compromise to be 1.6% for their top 3 ASes.

We now compare our results on the compromise rates of Tor streams against previous work. We calculate the percentage of Tor streams which contain an AS on both the client to guard and exit to destination paths in the forward, reverse, and combined forward and reverse directions for each of our 18 billion calculated streams. We then compare the results of our directional AS path inferences directly to the results from previous work and confirm that our AS path inferences give similar results for the top AS path, as shown in Table 3. We find our results most closely correlate with the work of Edman and Syverson, with Juen's results being lower than the average and Wacek et al. being much higher. We find this unsurprising since we also use the AS inference algorithm from Qiu and Gao. We surmise that the AS inference from iPlane produces higher compromise estimates, as seen in Feamster and Dingledine and Wacek et al. Juen also uses a modified AS mapping algorithm which may produce lower compromise rates.

AS                               Our Rank   Johnson et al. Rank   Johnson et al. Comp %   Our Comp %   Our TR Comp %
AS6939 HURRICANE Electric        1          3                     .6%                     .4%          0.0%
AS3356 Level 3 Communications    2          1                     .4%                     .5%          .13%
AS1299 TeliaNet Global           3          2                     .4%                     .5%          .5%

IX                               Our Rank   Johnson et al. Rank   Johnson et al. Comp %   Our Comp %   Our TR Comp %
LINX Juniper                     1          NA                    NA                      .4%          .05%
DE-CIX Frankfurt                 2          1                     .1%                     .4%          .05%
Equinix Ashburn                  3          NA                    NA                      .4%          0.0%

Table 4:

Stream Compromise Rates for the Top 3 AS and IX Adversaries for our Work compared to Johnson et al.

Johnson et al. investigated the time expected before a user would most likely use a stream compromised by an AS or IX adversary. Since we only have inferred and traceroute data for .8% of streams, it is not possible to directly compare the time to compromise for our clients. Instead, we investigate the ability of the top 3 AS and IX adversaries to compromise a Tor stream. Once again, we only consider streams for which we have both inferred and traceroute data. The ASes and IXes with the highest probability of compromising a Tor stream are shown in Table 4. Interestingly, we observe the same set of three top AS adversaries but in a slightly different order. We also see the top IX adversary of Johnson et al. as number 2 for compromise rates. We see a similar .5% rate of streams compromised by our top AS adversaries. We see a higher rate of overall streams compromised by our top IX adversary, at .4% compared to roughly .1% in Johnson's work. The overall compromise rates of all streams using the traceroute measurements are much more interesting. The traceroute measurements never see AS6939, the top compromising AS in the inferences. On the other hand, the traceroute measurements indicate AS1299 can compromise the same percentage of streams as indicated in the inference. We see drops for both top IX compromise rates for the measurements versus the inferences. Once again, Equinix Ashburn is not observed in the traceroute measurements. Overall, we expect a drop in the actual ability of an IX point to compromise a Tor stream, primarily due to the extremely high false inference of IX points.
The difference between AS compromise rates is more interesting. This shows that the path inference was always wrong in identifying AS6939 when compared to traceroutes; however, AS1299 was seen in the traceroutes at similar rates as predicted. Thus, the inference accuracy appears to vary greatly depending on which AS is being considered as an adversary.

7. LIMITATIONS AND FUTURE WORK

The data in this study is limited in several ways. While our volunteer measuring relays covered roughly 25% of Tor selection probability, they still comprise only 28 hosts. In addition, all path inferences were done on paths from Tor relays, leaving us without symmetric path information. Furthermore, we collected most of our data in the span of weeks, and so missed alternative routing paths and routing instabilities. We also lack ground truth because of measurement weaknesses such as missing or incorrect traceroute hops, missing or stale IP prefix announcements from the public route collectors, and incomplete or incorrect IXP prefix data. We look forward to the opportunity to expand network measurement in cooperation with Tor and using third-party vantage points such as Looking Glass servers. We also hope to make use of advances in measurement tools to advance this line of inquiry.

A larger remaining challenge is to move past the current focus on AS-level techniques and adversaries. As Jaggard et al. describe [15], an adversary may find it easier to control a group of IP routers running a certain version of software than to observe those in the same AS or IXP. Evaluating the threat of more complicated and realistic network adversaries will require both better adversary modeling and more detailed route inference techniques.

8. CONCLUSIONS

We have presented a measurement study to evaluate the suitability of Internet AS and IXP path-prediction algorithms to assess and mitigate the threats from network-level adversaries to the Tor network. Using traceroute data from the volunteer operators of 28 Tor relays, we show that current techniques for inferring AS-level Internet paths and the IXes between them significantly overestimate the number of ASes and IXes traversed by Tor traffic.

To evaluate what this means about the current and future security of Tor, we perform Monte Carlo simulations of Tor's current path-selection algorithm and the AS/IXP-independent path-selection algorithm proposed in the literature. When we examine the results, we see evidence that Tor is likely less vulnerable to an AS or IXP adversary than has been previously found. A direct comparison with a prior evaluation shows that it is likely to have overstated the risk of a single AS many times over and that of a single IXP by an order of magnitude.

We also find that the AS/IXP-independent path-selection algorithm may still leave a significant chance for users to be deanonymized over time due to the errors in path prediction: we estimate a 5–11% risk in just one week when the claimed chance is 0. Moreover, we find that this algorithm appears to force a tradeoff between connection failures and exposing users to potentially-malicious relays, even though in nearly all cases the failures could be avoided with better measurement.

Our results suggest the importance of accurate measurement both for understanding Tor security and for improving it.

9. REFERENCES

Proceedings of the ACM SIGCOMM 2012 conference on Applications, technologies, architectures, and protocols for computer communication, pages 163–174. ACM, 2012.
[4] M. Akhoondi, C. Yu, and H. Madhyastha. Lastor: A low-latency as-aware tor client. In Security and Privacy (SP), 2012 IEEE Symposium on, pages 476–490, May 2012.
[5] B. Augustin, X. Cuvellier, B. Orgogozo, F. Viger, T. Friedman, M. Latapy, C. Magnien, and R. Teixeira. Avoiding traceroute anomalies with Paris traceroute. In Proceedings of the 6th ACM SIGCOMM conference on Internet measurement, pages 153–158. ACM, 2006.
[6] B. Augustin, B. Krishnamurthy, and W. Willinger. IXPs: mapped? In Proceedings of the 9th ACM SIGCOMM conference on Internet measurement conference, IMC '09, pages 336–349, New York, NY, USA, 2009. ACM.
[7] K. Bauer, D. McCoy, D. Grunwald, T. Kohno, and D. Sicker. Low-resource routing attacks against Tor. In Proceedings of the Workshop on Privacy in the Electronic Society (WPES 2007), 2007.
[8] G. Danezis and P. Syverson. Bridging and fingerprinting: Epistemic attacks on route selection. In Proceedings of the 8th International Symposium on Privacy Enhancing Technologies, PETS '08, 2008.
[9] R. Dingledine, N. Hopper, G. Kadianakis, and N. Mathewson. One fast guard for life (or 9 months). In , 2014.
[10] R. Dingledine, N. Mathewson, and P. Syverson. Tor: The second-generation onion router. In Proceedings of the 13th USENIX Security Symposium, August 2004.
[11] M. Edman and P. F. Syverson. AS-awareness in Tor path selection. In Proceedings of the 2009 ACM Conference on Computer and Communications Security, CCS 2009, 2009.
[12] T. Elahi, K. Bauer, M. AlSabah, R. Dingledine, and I. Goldberg. Changing of the guards: A framework for understanding and improving entry guard selection in tor. In Proceedings of the 2012 ACM Workshop on Privacy in the Electronic Society, WPES '12, pages 43–54, New York, NY, USA, 2012. ACM.
[13] N. Feamster and R. Dingledine. Location diversity in anonymity networks. In Proceedings of the Workshop on Privacy in the Electronic Society (WPES 2004), 2004.
[14] L. Gao. On inferring autonomous system relationships in the Internet. IEEE/ACM Transactions on Networking, 9(6), December 2001.
[15] A. D. Jaggard, A. Johnson, P. Syverson, and J. Feigenbaum. Representing network trust and using it to improve anonymous communication. In 7th Workshop on Hot Topics in Privacy Enhancing Technologies (HotPETs 2014), 2014.
[16] A. Johnson, P. Syverson, R. Dingledine, and N. Mathewson. Trust-based anonymous communication: Adversary models and routing algorithms. In Proceedings of the 18th ACM Conference on Computer and Communications Security (CCS 2011), pages 175–186. ACM, 2011.
[17] A. Johnson, C. Wacek, R. Jansen, M. Sherr, and P. Syverson. Users get routed: Traffic correlation on Tor by realistic adversaries. In Proceedings of the 20th ACM Conference on Computer and Communications Security (CCS '13). ACM, 2013.
[18] J. P. J. Juen. Protecting anonymity in the presence of autonomous system and Internet exchange level adversaries. Master's thesis, University of Illinois, 2012.
[19] M. Luckie. Scamper: a scalable and extensible packet prober for active measurement of the internet. In Proceedings of the 10th ACM SIGCOMM conference on Internet measurement, pages 239–245. ACM, 2010.
[20] M. Luckie, B. Huffaker, k. claffy, A. Dhamdhere, and V. Giotsas. AS relationships, customer cones, and validation. In Internet Measurement Conference (IMC), 2013.
[21] M. Luckie, Y. Hyun, and B. Huffaker. Traceroute probe method and forward IP path inference. In Proceedings of the 8th ACM SIGCOMM Conference on Internet Measurement, IMC '08, 2008.
[22] Z. M. Mao, J. Rexford, J. Wang, and R. H. Katz. Towards an accurate AS-level traceroute tool.
[23] S. J. Murdoch and P. Zieliński. Sampled traffic analysis by Internet-exchange-level adversaries. In Proceedings of the Seventh Workshop on Privacy Enhancing Technologies (PET 2007), 2007.
[24] J. Qiu and L. Gao. AS path inference by exploiting known AS paths. In GLOBECOM, 2006.
[25] P. Syverson, G. Tsudik, M. Reed, and C. Landwehr. Towards an analysis of onion routing security. In H. Federrath, editor, Proceedings of Designing Privacy Enhancing Technologies: Workshop on Design Issues in Anonymity and Unobservability, pages 96–114. Springer-Verlag, LNCS 2009, July 2000.
[26] C. Wacek, H. Tan, K. S. Bauer, and M. Sherr. An empirical evaluation of relay selection in tor. In NDSS, 2013.
[27] Y. Zhang, R. Oliveira, Y. Wang, S. Su, B. Zhang, J. Bi, H. Zhang, and L. Zhang. A framework to quantify the pitfalls of using traceroute in AS-level topology measurement.