rTraceroute: Réunion Traceroute Visualisation
arXiv [cs.NI]
Xavier Nicolay∗, Réhan Noordally∗ and Yassine Gangat∗†
∗ Laboratoire d’Informatique et de Mathématiques
† Laboratoire d’Energétique, d’Electronique et Procédés
University of Reunion Island, 15 Rue René Cassin, 97490 Sainte Clotilde, France
email: fi[email protected]
Abstract—Traceroute is the main tool for exploring Internet paths. It provides limited information about each node along the path. However, Traceroute offers neither statistical analysis nor a Man-Machine Interface (MMI). Indeed, no graphical tool is able to draw all the paths used by IP routes. We present a new tool that can handle more than 1,000 Traceroute results, map them, graphically identify MPLS links, and report the usage of every route (in percent) to improve knowledge of the links between countries. rTraceroute aims to go deeper into the use of atomic traces. In this paper, we discuss the concept of rTraceroute and present some examples of its usage.
I. INTRODUCTION
A. Context
Since the beginning of the Internet, the network has evolved from a simple graph with a few vertices into a wide and complex graph with an uncountable number of nodes.
To improve our understanding of Internet routing and topology, active metrology, which injects packets into the network, is the only possible approach. Traceroute is one of the most widely used network measurement tools. It reports an IP address for each network-layer device along the path from a source to a destination host in an IP network. Network operators and researchers rely on Traceroute to diagnose network problems and to infer properties of IP networks, such as the topology of the Internet.
Reunion Island is a small territory located in the Indian Ocean, near South Africa, off the eastern coast of Madagascar. It is also one of the overseas departments of France. For Reunion Island, this issue matters more than elsewhere because its connectivity to the Internet relies on two submarine cables, South Africa Far-East (SAFE) and Lower Indian Ocean Network (LION) II, shown respectively in green and in red in figure 1. While we know the physical path of the Internet connection, no one knows the logical path of our connection. This is why we want to graph the links between Reunion Island and destinations worldwide. This new representation will improve the global knowledge of the local topology.
B. Project
The Regional Council of Reunion has financed a metrology project, with European Regional Development Funds (ERDF), to identify the difficulties of Reunion Island's connectivity. The requirements for this project are:
1) To identify the first hop after any connection leaves Reunion Island and the last hop before it reaches us. This will help characterize the logical exits and entries for Reunion Island.
2) To detect the MultiProtocol Label Switching (MPLS) links between Reunion Island and the different hops. This will help us understand how the exits and entries for Reunion Island are organized.
3) To focus on the regional peering problem. This item will allow us to see whether the physical cables and the routing policies can be superposed.
4) To identify the most encountered node after (resp. before) leaving (resp. reaching) Reunion Island. This could improve our knowledge of the routing rules of all Internet Service Providers (ISP) present on the territory.
5) To measure the minimal delay associated with each link between Reunion Island and the next (resp. previous) hop. This could reveal links that are prioritized according to the final destination (resp. original source).
However, these points cannot easily be identified with a small data-set. We need to analyze a fairly large data-set, with more than one probe for each ISP present on the Island. Usually, each network tool generates its own measurements before creating an output in text and/or image format. Without a measurement platform, it is difficult to combine data from several sources (resp. destinations) from (resp. to) the same country. Moreover, before running an analysis we need to proceed methodically as follows:
1) Cleaning data-sets.
2) Matching data-sets from different tools, with different standards.
3) Adding geo-localization.
4) Showing the results on a map.
To help solve this problem, we propose rTraceroute, a tool which is "easy to use" but can provide a map together with statistics by analyzing data from different Traceroute sources. This tool takes only one parameter: a folder which contains data from Traceroute tools. It then gives us two outputs: the first one is a statistics file about each node met and the second is a map with the different links between countries.
Thus, rTraceroute allows us to save time and to focus on the analysis of the outputs. The rest of this paper is organized as follows. First, we describe some existing tools in section II. Section III describes rTraceroute. An example of how it works is presented in section IV, followed by a conclusion in section V.

II. EXISTING TOOLS
As things stand today, no single tool meets all our needs, although some tools address part of them. For instance, there are tools that identify the path from a probe to a destination. This is why we created rTraceroute for this project.
In this section, we present several tools that could help us. We have categorized them as follows:
1) Tools that give raw results, without any statistics or map.
2) Measurement platforms that can take advantage of the previous tools and use them together.
3) Mapping tools that can draw their results on a map.
A. Raw Results
These tools provide data from one measurement at a time; they were not built to generate a large volume of raw data without a script.
1) Traceroute:
Traceroute [1] is the most popular tool to measure path length and to determine the routes that packets follow from a source node to a destination node. With the growth of the Internet, many paths can be found with Traceroute. The main flaw of this tool is that it can only generate raw data (without any statistics or mapping).
2) Paris-Traceroute:
Paris-Traceroute [2] is a Traceroute-like tool that is less sensitive to load balancing. Its output flags some errors in the paths, such as Unreachable Network (!N) or Unreachable Host (!H). It also reports explicit MultiProtocol Label Switching (MPLS) [3] links. Using Paris-Traceroute in our work would identify the MPLS links, but it has the same flaws as Traceroute.
3) TraIXroute:
TraIXroute [4] is also a Traceroute-like tool, which discovers Internet eXchange Points (IXP) in the paths. This version of Traceroute combines two databases, PeeringDB [5] and Packet Clearing House [6], to identify these particular nodes. This information could help reconstruct the countries crossed. Despite an accuracy of 92-93%, this tool is not suited to our study because, with its IXP databases, it can only detect the regional routing problem [4].
4) Reverse-Traceroute:
Asymmetric paths are frequent in the Internet, and the reverse path is very difficult to identify. To address this, [7] presented Reverse-Traceroute. This tool aims to identify the return path of a classic Traceroute, with an accuracy of 87% for hop identification. It is mainly based on two IP options, IP Record-Route and IP Timestamp. Like the original Traceroute, Reverse-Traceroute can only generate raw Traceroute data, but for the return path. Another method to obtain the reverse path is to use a measurement platform.
B. Measurement Platform
According to [8], an Internet measurement platform is an infrastructure of dedicated probes that periodically run network measurement tests on the Internet.
1) Atlas RIPE NCC:
Atlas RIPE NCC [9] is an active measurement platform that allows the use of several active tools, such as ping, DNS, HTTP or NTP. There are two different tools for path measurement. The default one is Paris-Traceroute. The second one is the original Traceroute. The Atlas platform returns its results in JavaScript Object Notation (JSON) format. It is also possible to map the path with OpenIPMap [10]. Although Atlas can use Paris-Traceroute, map paths with geo-localization and compare the large data produced by its measurements, this platform is not suited to our project because it lacks the statistics part.
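To illustrate what handling the Atlas JSON output involves, the following minimal sketch flattens one traceroute result into (hop, IP address, minimal RTT) rows. The field names (`result`, `hop`, `from`, `rtt`, and `x` for a timed-out probe) are assumed to follow the layout of RIPE Atlas traceroute results; the function name and the sample values are hypothetical, and the sketch is written in Python rather than in the C used by rTraceroute.

```python
import json

def flatten_atlas_trace(raw_json):
    """Return a list of (hop, ip, min_rtt) tuples for one Atlas traceroute.

    Assumes a top-level "result" list whose entries carry a "hop" number
    and a "result" list of per-probe replies.
    """
    measurement = json.loads(raw_json)
    rows = []
    for hop in measurement.get("result", []):
        best = {}  # ip -> smallest RTT seen at this hop
        for reply in hop.get("result", []):
            ip, rtt = reply.get("from"), reply.get("rtt")
            if ip is None or rtt is None:
                continue  # timed-out probe, assumed to look like {"x": "*"}
            best[ip] = min(rtt, best.get(ip, rtt))
        if not best:
            rows.append((hop.get("hop"), None, None))  # no answer at this hop
        for ip, rtt in best.items():
            rows.append((hop.get("hop"), ip, rtt))
    return rows

# Hypothetical sample using documentation-only IP addresses.
sample = json.dumps({
    "dst_addr": "192.0.2.10",
    "result": [
        {"hop": 1, "result": [{"from": "198.51.100.1", "rtt": 1.2},
                              {"from": "198.51.100.1", "rtt": 0.9},
                              {"x": "*"}]},
        {"hop": 2, "result": [{"x": "*"}, {"x": "*"}, {"x": "*"}]},
    ],
})
for row in flatten_atlas_trace(sample):
    print(row)
```

Keeping the smallest RTT per address mirrors the interest of this project in minimal delays; a hop without any answer is kept as an empty row so that later filtering can decide what to do with it.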
2) Planet-Lab:
In 2003, some American researchers deployed a test-bed platform called Planet-Lab [11]. In 2008, a European portion of Planet-Lab [12] was created. On Planet-Lab nodes, researchers can install the tools they need for their work. With this policy, the platform can combine several tools. Nevertheless, it is unable to identify the last (resp. first) hop before (resp. after) reaching (resp. leaving) Reunion Island. Moreover, the management of large data is not allowed on Planet-Lab.
3) Archipelago:
The Center for Applied Internet Data Analysis (CAIDA) deployed its own platform in 2007, called Archipelago [13]. The probes connected to this platform allow five main measurements, including ping and Traceroute. But Archipelago, like Traceroute, can only generate raw data without any analysis.
C. Mapped results
1) GTrace:
The first tool created to graph Traceroute data was GTrace [14]. This tool generates its own Traceroute data and draws it on a map. It relies on city name abbreviations or airport codes, a lookup client and two IP databases to validate the location of each IP address. This tool has not been maintained for a long time and does not support the x86 architecture. As a result, we were unable to test it.
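The idea of reading location hints from airport codes can be illustrated with a short sketch. It looks for IATA airport codes inside a router hostname; the code table, the hostnames and the function name are hypothetical, and this is only an illustration of the principle, not GTrace's actual implementation.

```python
import re

# Tiny illustrative table of IATA airport codes; a real tool needs a full list.
AIRPORT_CODES = {
    "run": ("Saint-Denis", "RE"),   # Roland Garros Airport, Reunion Island
    "cdg": ("Paris", "FR"),
    "jnb": ("Johannesburg", "ZA"),
    "lhr": ("London", "GB"),
}

def guess_location(hostname):
    """Return (city, country) if a hostname token is a known code, else None."""
    # Split on dots, dashes and digits so that "cdg2" or "xe-0.jnb1" give clean tokens.
    for token in re.split(r"[.\-\d]+", hostname.lower()):
        if token in AIRPORT_CODES:
            return AIRPORT_CODES[token]
    return None

print(guess_location("ae-3.cdg2.example.net"))
print(guess_location("core1.example.net"))
```

Such a heuristic can produce false positives (a token that merely looks like a code), which is one reason why a tool of this kind cross-checks the hint against IP databases.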
2) Topology Visualization Tool:
In [15], a tool focused on topology analysis in Africa was presented. This tool can generate its own Traceroute measurements and graph them after requesting information about nodes from several databases. The design is largely based on existing visualization tools, such as the OpenIPMap project, and focuses only on the African continent. The source code of the tool was not indicated in the article, thus no test has been made.
Table I summarizes the capabilities of each tool presented previously. The gray cells represent the options available for each tool, while the white cells mean that the option is not present.
The header line of the table shows our needs:
• Last hop identification corresponds to the country where the node is hosted before leaving (resp. joining) a specific country. Only a graph tool can identify this point.
• MPLS mark detects the explicit MPLS link between two nodes. Only Paris-Traceroute can detect these, whereas rTraceroute, with its graph part, can also reveal invisible MPLS routes.
• The statistics part needs a large amount of data for the analysis. In general, raw-result generators do not provide any analysis. Even graphic tools cannot do it because they redraw the map on each new test. rTraceroute can analyze a large dataset and compute statistics on each node met.
• The geo-localization of each IP address is based on the declarations of the RIR and the ISP. It is well known that geo-localization is not an exact science. rTraceroute proposes a database which can be improved by the user's analysis.
• In the general case, mapping tools need to generate their own data before drawing them on a map. The first problem concerns the choice of the map and the possibility of downloading the final result. With rTraceroute, you can choose your own map as input and download the results for any publication or presentation. Mapping data allows us to identify the different countries passed through and to detect potential errors. With rTraceroute, these phenomena can be easily detected by the mapping and the statistics parts.
• The database is mainly used for the geo-localization of IP addresses or the detection of IXPs. Our database is, for now, focused on Reunion Island with the help of delay analysis.
• The regional routing problem means the identification of peering mistakes, like the boomerang-routing phenomenon [16].
• Each tool generates its own raw data one measurement at a time. rTraceroute proposes to analyze a large dataset for the statistics and mapping parts.
Due to the imperfections of these tools (relative to our needs), we have chosen to develop a new tool: rTraceroute.

III. RTRACEROUTE
A. The concept
rTraceroute was created to simplify the analysis of Traceroute data files.
On the one hand, we must process a huge number (more than 1 million) of Traceroute files. On the other hand, we are looking for performance. We therefore chose to write the program in standard C, keeping in mind that it should be portable (Linux, OS X, ...), without any platform-specific code. We also want the program to be easy to compile (this is why the code is a monolithic block of fewer than three thousand lines), with a very simple Makefile. Today, the executable runs on a dedicated server with Linux Ubuntu 14.04 LTS.
Figure 2 schematizes the concept of rTraceroute.

Fig. 2. Concept of rTraceroute

The inputs used by rTraceroute are:
• Several files of Traceroute data, whether they come from Paris-Traceroute [2] or the Atlas platform [9], in text format.
• The URI of the database used for geo-localization.
• A map.
Then, rTraceroute proceeds as follows:
1) Parsing and Cleaning: As the input data come from Paris-Traceroute [2] or the Atlas platform [9], the program must parse JSON files as well as plain text files. JSON files are parsed with NXJson [17]. To be more efficient, the program has a "cleaning procedure" (as explained in I-B) that eliminates useless (or corrupted) files. The pre-parsing subroutine uses the following exclusion patterns:
• The JSON is corrupted.
• The JSON file contains the last 3 RTTs as (zero) [unreachable].
• The Traceroute is corrupted [the file contains more than one Traceroute].
• The Traceroute contains 3 stars on the last hop [host unreachable].
• The Traceroute contains a !H, !N or WARN string [unreachable].
2) Statistics: While parsing files, rTraceroute builds an internal linked list memorizing all information. Each IP address and hop is marked with an RTT calculated on the fly, later completed with the localization (country) from the internal MySQL database.
3) Geo-localization: From this step on, we need to geo-localize IP addresses. This is why the program needs a MySQL database (the connection information is "hard-coded" in main.h). Two tables are used: the first one to geo-localize IP addresses and the second one to place the country on a map.
Then, to be able to draw each trace, gdlib is used. To store all information obtained during the process, three kinds of tuples are used:
• {hop, ip address, rtt, occurrences} is added when reading files.
• {mapx1, mapy1, mapx2, mapy2, ip_address1, ip_address2, link} is created when drawing is done, to handle all MPLS links.
• {xdeb, ydeb, xfin, yfin, occurences, rtt, mpls} is also handled when the program draws.
4) MPLS Detection: Paris-Traceroute can detect explicit MPLS links and mark them in its raw data. When rTraceroute finds these marks, it memorizes the links to plot them in a different color at the end of the mapping.
5) Mapping: Some useful options have been implemented; the program can handle HTML color names and create statistics:
• It can draw only the last links (the last hop before coming into a country).
• It can make the thickness of a link proportional to its use (with the -redraw option).
• It colorizes MPLS links.
The program produces three kinds of result files:
• A simple one that contains only 5 columns: hop_position, IP address, occurrence, average_delay, country, used for later statistical purposes.
• A very verbose one called trace.txt that explicitly details all internal operations. From this text file, it is possible to extract and produce more results (such as the usage of each link in percent). For example:
– to extract all MPLS links:
$ grep 'lien.*MPLS' trace.txt
and we obtain this kind of result:
// lien <-> MPLS: 93.17.132.110 109.24.74.178
– to find information on each drawn line (x1,y1 to x2,y2), occurrences, occurrence in percent among all lines, minimal time and geo-localization, in text, for all segments:
$ grep '__ trajet' trace.txt
and we obtain this kind of result:
__ trajet: 3626 1638 -> 2581 1582:105 (2.81%) [AU - RE et temps min 05.39 (2.81%)]
• Two graphical files (maps): the first shows all the segments of the input files and the second shows only the segments of the last hop for a destination point (x,y).
rTraceroute is the tool best suited to our project. Indeed, its mapping capability helps us answer the first need of the project, which is the identification of the first (resp. last) hop after (resp. before) leaving (resp. reaching) Reunion Island. The map function also helps us with the third item of the project: the regional peering problem. Paris-Traceroute detects explicit MPLS links. With rTraceroute, even invisible MPLS [3] links are shown with the maps option. The statistics part of rTraceroute answers the last two points, which are respectively the identification of the most encountered node after (resp. before) leaving (resp. reaching) Reunion Island and the minimal delay associated with each link between Reunion Island and the next (resp. previous) hop.

TABLE I
TOOLS OPTION AVAILABILITY
Needs (columns): Last Hop identification, MPLS Mark, Statistics, Geo-localization, Mapping, Database, Regional Routing problem, Large data
Tools (rows):
Traceroute:
Paris-Traceroute: YES
TraIXroute: YES YES
Reverse-Traceroute:
Atlas RIPE NCC: YES YES YES YES YES YES
PlanetLab: YES YES YES YES
Archipelago:
Gtrace: YES YES YES YES
African Visual Route: YES YES YES YES
rTraceroute: YES YES YES YES YES YES YES YES

IV. APPLICATION OF RTRACEROUTE ON REUNION ISLAND
This paper began with a quick presentation of different active tools and platforms. This part shows how rTraceroute uses the data provided by these tools to analyze paths from a country to several destinations. Section IV-A explains how rTraceroute can be used directly with Paris-Traceroute data, while section IV-B uses Paris-Traceroute data in JSON format from the Atlas RIPE NCC platform. The results have been previously analyzed in [18] and are available on the website of the authors.
A. Paris-Traceroute dataset
In this example, we test rTraceroute on the publicly available data-set from [18]. This large data-set was obtained during one month, between the third of July and the third of August 2016. With 27 Raspberry Pi [19] used as measurement probes, each day we tried to reach , IPv4 addresses spread around the world as destinations. At the end of the measurement campaign we had obtained a total of , , raw traces. With rTraceroute, the raw data were reduced to , , traces available for path analysis. Using rTraceroute on Paris-Traceroute data has these advantages:
• The first one concerns the raw data. The tool keeps only the data which meet the selection criteria presented in section III.
• Mapping the paths makes the path analysis between countries easier. Along with mapping, it is possible to print only the first (resp. last) node after (resp. before) leaving (resp. reaching) a country, with the path and the associated delay.
• Lastly, rTraceroute can change the color of all explicit MPLS links found by Paris-Traceroute. The mapping also permits the detection of invisible MPLS links.
Figure 3 represents the results for the first hop when data leave Reunion Island.
In this figure, we can see the first hop after Reunion Island. Despite the presence of two physical connections with Asia and Africa (figure 1), we can notice that most of the first hops are not connected to these continents but to Europe.
Another interesting point is the presence of a direct link between Reunion and the USA, the sign of an invisible MPLS link. We can also see that more than 96% of our data go through France before reaching the destination, even if the destination is close to Reunion Island. A detailed analysis of the results is available in [18].
The analysis of the same data with only a BASH script took more than one day. With rTraceroute, around 10 minutes is sufficient.
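The reduction from raw traces to usable traces comes from the exclusion patterns listed in section III. A minimal sketch of the plain-text rules is given here, in Python rather than in the C of rTraceroute. The function name is hypothetical, and two assumptions are ours: each Traceroute starts with a header line beginning with "traceroute", and each hop line begins with the hop number.

```python
import re

def is_usable_text_trace(text):
    """Apply the plain-text exclusion patterns to the content of one file."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    # Assumption: every Traceroute run starts with a header line "traceroute ...".
    headers = [line for line in lines if line.lower().startswith("traceroute")]
    if len(headers) != 1:
        return False  # corrupted: no header, or more than one Traceroute in the file
    # Assumption: hop lines begin with the hop number.
    hops = [line for line in lines if re.match(r"\d+\s", line)]
    if not hops:
        return False  # corrupted: no hop at all
    if hops[-1].count("*") >= 3:
        return False  # three stars on the last hop: host unreachable
    if re.search(r"!H|!N|WARN", text):
        return False  # explicit unreachable marker or warning
    return True

# Hypothetical traces using documentation-only IP addresses.
reached = ("traceroute to 192.0.2.10 (192.0.2.10), 30 hops max\n"
           " 1  198.51.100.1  0.9 ms  1.1 ms  1.0 ms\n"
           " 2  192.0.2.10  180.2 ms  181.0 ms  179.8 ms\n")
lost = ("traceroute to 192.0.2.10 (192.0.2.10), 30 hops max\n"
        " 1  198.51.100.1  0.9 ms  1.1 ms  1.0 ms\n"
        " 2  * * *\n")
print(is_usable_text_trace(reached), is_usable_text_trace(lost))
```

Running such a filter before any parsing keeps the later statistics free of traces that never reached their destination.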
B. Atlas RIPE NCC dataset
In this case study, we analyze , measurements towards Reunion Island from all over the world. Our traces include measurements performed from the third of July to the third of August 2016. , Atlas probes tried to reach ten Raspberry Pi [20] deployed over Reunion Island. The Atlas probes were selected with the same distribution as the IPv4 addresses in the Paris-Traceroute case.
Without rTraceroute, we would have been constrained to use OpenIPMap [10]. This tool has some limits: the other path measurements must be added manually to compare the results, and the maps cannot be downloaded. The "cleaning step" of rTraceroute reduced our data to , traces. The advantages of combining rTraceroute with Atlas are the same as with the other Traceroute tools. But in this context, we can highlight three benefits:
• First, we can easily combine data from several Atlas path measurements for the analysis. A script could download the JSON files and add them to a folder which is then analyzed by rTraceroute.
• The maps are hosted on the system which runs rTraceroute. Maps can be downloaded and used afterwards.
• Finally, the statistics file created by rTraceroute is very important. Each line of the file contains the position of an IP address met during the measurement, the IP address, the occurrence of this pair, the mean delay and the country associated with the IP address.
Figure 4 is an example of all the existing paths to reach Reunion Island, obtained using Atlas and rTraceroute. In figure 4, we can see a direct link between Paraguay and Reunion Island although no such submarine cable exists. This indicates that an invisible MPLS link exists. Without rTraceroute, this would be more difficult to find. We can also notice the use of the sections of the submarine cables from Asia and Africa. The share of traces using the submarine cable exit points on these two continents is less than 0.5%, whereas more than 99% go through France, with a minimal delay of . ms.
This measurement campaign from the world towards Reunion Island provided , JSON raw data files, for an analysis time of 3 minutes with rTraceroute. A BASH script developed for the analysis of these raw data took more than half a day.

Fig. 3. Paris-Traceroute results example
Fig. 4. Example of results from rTraceroute used in combination with Atlas
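The shares quoted in this section (for example, more than 99% of the traces entering through France) are ratios of link occurrences. The following sketch shows how such percentages, together with the minimal delay per link, could be derived from a list of observed inter-country links. The input tuples and the function name are hypothetical, the values are made up for illustration, and the code is Python rather than the C of rTraceroute.

```python
from collections import defaultdict

def link_usage(observations):
    """Aggregate (src_country, dst_country, rtt_ms) observations per link.

    Returns {(src, dst): (occurrences, percent_of_all_links, minimal_rtt_ms)}.
    """
    counts = defaultdict(int)
    min_rtt = {}
    for src, dst, rtt in observations:
        link = (src, dst)
        counts[link] += 1
        min_rtt[link] = min(rtt, min_rtt.get(link, rtt))
    total = sum(counts.values())
    return {
        link: (n, round(100.0 * n / total, 2), min_rtt[link])
        for link, n in counts.items()
    }

# Made-up observations of the last link before reaching Reunion Island (RE).
observations = [
    ("FR", "RE", 182.4), ("FR", "RE", 179.9), ("FR", "RE", 185.0),
    ("ZA", "RE", 48.3),
]
for link, stats in sorted(link_usage(observations).items()):
    print(link, stats)
```

The same counts could also drive the thickness of each drawn link, since a link's share of all occurrences is exactly the quantity a proportional line width needs.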
V. CONCLUSION
Studying the Internet connectivity of Reunion Island is important for the economic development of the island. An ERDF project has been funded to identify the difficulties of the Internet connectivity of the Island. Several tools are available, but none of them covers our needs. Almost no tool can handle a large number of Traceroute files and draw all the paths on a map. This is why we have implemented a new tool, called rTraceroute. It also offers us a new approach to observe the Internet topology from a different point of view.
As of today, no IP geo-localization database is reliable enough to plot every IP found in the traces. The next step will be to improve the IP geo-localization of each point, based on the delay between two points and a system of triangulation. IPv6 deployment is in its infancy: the routes may not be optimized. Adding IPv6 to rTraceroute could also help the deployment of this new protocol in the area, but some adaptations are needed because the current work is based on IPv4. Export of maps in Encapsulated PostScript (EPS) or Scalable Vector Graphics (SVG) format is also planned (currently, the resulting maps are only in PNG format).

REFERENCES
[1] G. S. Malkin, “Traceroute using an IP option,” 1993.
[2] B. Augustin, X. Cuvellier, B. Orgogozo, F. Viger, T. Friedman, M. Latapy, C. Magnien, and R. Teixeira, “Avoiding traceroute anomalies with Paris traceroute,” in Proceedings of the 6th ACM SIGCOMM conference on Internet measurement. ACM, 2006, pp. 153–158.
[3] B. Donnet, M. Luckie, P. Mérindol, and J.-J. Pansiot, “Revealing MPLS tunnels obscured from traceroute,” ACM SIGCOMM Computer Communication Review, vol. 42, no. 2, pp. 87–93, 2012.
[4] G. Nomikos and X. Dimitropoulos, “traIXroute: Detecting IXPs in traceroute paths,” in International Conference on Passive and Active Network Measurement
NSDI, vol. 10, 2010, pp. 219–234.
[8] V. Bajpai and J. Schönwälder, “A survey on internet performance measurement platforms and related standardization efforts,” IEEE Communications Surveys & Tutorials, vol. 17, no. 3, pp. 1313–1341, 2015.
[9] R. NCC, “RIPE atlas,” 2010. [Online]. Available: https://atlas.ripe.net
[10] E. Aben, “Infrastructure geolocation - plan of action,” 2015. [Online]. Available: https://labs.ripe.net/Members/emileaben/infrastructure-geolocation-plan-of-action
[11] B. Chun, D. Culler, T. Roscoe, A. Bavier, L. Peterson, M. Wawrzoniak, and M. Bowman, “PlanetLab: An overlay testbed for broad-coverage services,” SIGCOMM Comput. Commun. Rev., vol. 33, no. 3, pp. 3–12, Jul. 2003. [Online]. Available: http://doi.acm.org/10.1145/956993.956995
[12] “PlanetLab europe, an open platform for developing, deploying, and accessing planetary-scale services.” [Online]. Available: http://planet-lab.eu/
[13] K. Claffy, Y. Hyun, K. Keys, M. Fomenkov, and D. Krioukov, “Internet mapping: from art to science,” in Conference For Homeland Security, 2009. CATCH’09. Cybersecurity Applications & Technology. IEEE, 2009, pp. 205–211.
[14] R. Periakaruppan, E. Nemeth et al., “GTrace: A graphical traceroute tool.” in LISA, vol. 99, 1999, pp. 69–78.
[15] C. Yang, H. Suleman, and J. Chavula, “A topology visualisation tool for national research and education networks in Africa,” in IST-Africa Week Conference, 2016. IIMC, 2016, pp. 1–11.
[16] J. A. Obar and A. Clement, “Internet surveillance and boomerang routing: A call for Canadian network sovereignty,” in TEM 2013: Proceedings of the Technology & Emerging Media Track-Annual Conference of the Canadian Communication Association (Victoria), 2012.
[17] Y. Stavnichiy, “Nxjson, a light json parser written in c.” [Online]. Available: https://bitbucket.org/yarosla/nxjson/
[18] R. Noordally, X. Nicolay, P. Anelli, R. Lorion, and P. U. Tournoux, “Analysis of internet latency : the reunion island case,” in