Publication


Featured research published by Romain Fontugne.


Conference on Emerging Networking Experiments and Technologies (CoNEXT) | 2010

MAWILab: combining diverse anomaly detectors for automated anomaly labeling and performance benchmarking

Romain Fontugne; Pierre Borgnat; Patrice Abry; Kensuke Fukuda

Evaluating anomaly detectors is a crucial task in traffic monitoring, made particularly difficult by the lack of ground truth. The goal of the present article is to assist researchers in the evaluation of detectors by providing them with labeled anomaly traffic traces. We aim at automatically finding anomalies in the MAWI archive using a new methodology that combines different and independent detectors. A key challenge is to compare the alarms raised by these detectors, as they operate at different traffic granularities. The main contribution is a reliable graph-based methodology that combines the outputs of any anomaly detectors. We evaluated four unsupervised combination strategies; the best is the one based on dimensionality reduction. The synergy between anomaly detectors makes it possible to detect twice as many anomalies as the most accurate single detector and to reject numerous false-positive alarms. Significant anomalous traffic features are extracted from the reported alarms, so the labels assigned to the MAWI archive are concise. The results on the MAWI traffic are publicly available and updated daily. This approach also permits the inclusion of results from future anomaly detectors, improving the quality and variety of the labels over time.
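
The combination step can be pictured as a small graph problem. Below is a minimal sketch of the idea, not the authors' exact method: alarms from independent detectors become nodes, alarms that overlap in time and flag the same traffic are linked, and each connected component is scored by how many distinct detectors agree. The alarm format, detector names, and the two-detector agreement rule are invented for illustration.

```python
# Minimal sketch of graph-based alarm combination (illustrative, not the
# paper's method): link alarms that agree, then score connected components.
from itertools import combinations
import networkx as nx  # assumed available

# Hypothetical alarm format: (detector, src_ip, t_start, t_end)
alarms = [
    ("gamma", "192.0.2.1", 0, 60),
    ("kl",    "192.0.2.1", 30, 90),
    ("pca",   "192.0.2.1", 45, 70),
    ("hough", "198.51.100.7", 10, 20),
]

def overlap(a, b):
    """Alarms agree if they flag the same source IP in overlapping windows."""
    return a[1] == b[1] and a[2] <= b[3] and b[2] <= a[3]

g = nx.Graph()
g.add_nodes_from(range(len(alarms)))
for i, j in combinations(range(len(alarms)), 2):
    if overlap(alarms[i], alarms[j]):
        g.add_edge(i, j)

for component in nx.connected_components(g):
    detectors = {alarms[i][0] for i in component}
    label = "anomalous" if len(detectors) >= 2 else "suspicious"
    print(sorted(component), label)
```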


International Conference on Computer Communications | 2014

Hashdoop: A MapReduce framework for network anomaly detection

Romain Fontugne; Johan Mazel; Kensuke Fukuda

Anomaly detection is essential for preventing network outages and keeping network resources available. However, to cope with the continuous growth of Internet traffic, network anomaly detectors are only exposed to sampled traffic, so harmful traffic may evade examination. In this paper, we investigate the benefits of recent distributed computing approaches for real-time analysis of non-sampled Internet traffic. Focusing on the MapReduce model, our study uncovers a fundamental difficulty in detecting network traffic anomalies with Hadoop. Since MapReduce requires the dataset to be divided into small splits, and anomaly detectors compute statistics from spatial and temporal traffic structures, special care must be taken when splitting traffic. We propose Hashdoop, a MapReduce framework that splits traffic with a hash function to preserve traffic structures and hence benefits from distributed computing infrastructures to detect network anomalies. The benefits of Hashdoop are evaluated with two anomaly detectors and fifteen traces of Internet backbone traffic captured between 2001 and 2013. Using a 6-node cluster, Hashdoop increased the throughput of the slowest detector by a factor of 15, enabling real-time detection for the largest analyzed traces. Hashdoop also improved the detectors' overall accuracy, as the splits emphasized anomalies by reducing the surrounding traffic.
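
The core trick is that hashing on an address keeps all of a host's packets in the same split, so per-host statistics survive the division into map tasks. The following sketch illustrates this with assumed data structures; it is not the actual Hashdoop code, and the real framework also produces a second copy of the traffic hashed on destination addresses so scans spread over many destinations are not diluted either.

```python
# Minimal sketch of hash-based traffic splitting in the spirit of Hashdoop
# (assumed interface): all packets of one source land in the same bucket.
import hashlib

N_SPLITS = 6  # e.g., one split per cluster node

def split_id(src_ip: str, n_splits: int = N_SPLITS) -> int:
    """Deterministically map a source IP to one of n_splits buckets."""
    digest = hashlib.md5(src_ip.encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_splits

# Hypothetical packet records: (source IP, size in bytes)
packets = [("192.0.2.1", 1500), ("192.0.2.1", 40), ("203.0.113.9", 576)]
buckets = {i: [] for i in range(N_SPLITS)}
for src, size in packets:
    buckets[split_id(src)].append((src, size))

# Each bucket can now be handed to an anomaly detector as an independent
# map task without breaking per-host traffic structures.
for i, pkts in buckets.items():
    print(f"split {i}: {len(pkts)} packets")
```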


International Journal of Network Management | 2010

Unsupervised host behavior classification from connection patterns

Guillaume Dewaele; Yosuke Himura; Pierre Borgnat; Kensuke Fukuda; Patrice Abry; Olivier Michel; Romain Fontugne; Kenjiro Cho; Hiroshi Esaki

A novel host behavior classification approach is proposed as a preliminary step toward traffic classification and anomaly detection in network communication. Although many attempts described in the literature have been devoted to flow or application classification, these approaches are not always adaptable to the operational constraints of traffic monitoring (expected to work even without packet payload, without bidirectionality, on high-speed networks, or from flow reports only, etc.). Instead, the classification proposed here relies on the idea that traffic is relevantly analyzed in terms of typical host behaviors: typical connection patterns of both legitimate applications (data sharing, downloading, etc.) and anomalous (possibly aggressive) behaviors are obtained by profiling traffic at the host level using unsupervised statistical classification. Classification at the host level is not reducible to flow or application classification, nor vice versa: they are different operations that might play complementary roles in network management. The proposed host classification is based on a nine-dimensional feature space evaluating host Internet connectivity, dispersion, and exchanged traffic content. A minimum spanning tree (MST) clustering technique is developed that does not require any supervised learning step to produce a set of statistically established typical host behaviors. Not relying on a priori defined classes of known behaviors enables the procedure to discover new host behaviors that were potentially never observed before. This procedure is applied to traffic collected over the entire year of 2008 on a transpacific (Japan/USA) link. A cross-validation of this unsupervised classification against classical port-based inspection and a state-of-the-art method assesses the meaningfulness and relevance of the obtained classes of host behaviors.
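
The clustering step can be sketched compactly: build a minimum spanning tree over the host feature vectors, cut edges that are abnormally long, and read the clusters off the remaining connected components. The feature values and the cut rule below are invented placeholders; the paper's actual procedure is more refined.

```python
# Minimal sketch of MST-based clustering on synthetic host features.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree, connected_components

# Hypothetical 9-dimensional behavior features, one row per host.
rng = np.random.default_rng(0)
hosts = np.vstack([rng.normal(0, 1, (20, 9)), rng.normal(5, 1, (20, 9))])

dist = squareform(pdist(hosts))            # pairwise Euclidean distances
mst = minimum_spanning_tree(dist).toarray()

edges = mst[mst > 0]
cutoff = edges.mean() + 2 * edges.std()    # assumed heuristic cut rule
mst[mst > cutoff] = 0                      # sever abnormally long edges

n_clusters, labels = connected_components(mst, directed=False)
print(n_clusters, "clusters, sizes:", np.bincount(labels))
```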


Computer Communications | 2013

ADMIRE: Anomaly detection method using entropy-based PCA with three-step sketches

Yoshiki Kanda; Romain Fontugne; Kensuke Fukuda; Toshiharu Sugawara

Network anomaly detection using dimensionality reduction has recently been well studied as a way to overcome the weaknesses of signature-based detection. Previous works have proposed a method for detecting anomalous IP flows using random projections (sketches) and principal component analysis (PCA), yielding promisingly high detection capability without needing a pre-defined anomaly database. However, that detection method cannot be applied to traffic flows at a single measurement point, and appropriate parameter settings (e.g., the relationship between the sketch size and the number of IP addresses) have not yet been sufficiently studied. In this paper, we propose a PCA-based anomaly detection algorithm called ADMIRE to supplement and extend the previous works. The key idea of ADMIRE is the use of three-step sketches and adaptive parameter settings to improve detection performance and ease its use in practice. We evaluate the effectiveness of ADMIRE using longitudinal traffic traces captured on a transpacific link. The main findings of this paper are as follows: (1) We reveal a correlation between the number of IP addresses in the measured traffic and the appropriate sketch size, and we take advantage of this relation to set the sketch size parameter. (2) ADMIRE outperforms a traditional PCA-based detector and other detectors based on different theoretical backgrounds. (3) The types of anomalies reported by ADMIRE depend on the traffic features selected as input. Moreover, we found that a simple aggregation of several traffic features degrades detection performance.
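
A stripped-down version of the PCA residual idea is sketched below: learn a "normal" subspace from clean history, then flag time bins with high energy outside that subspace. The synthetic data, the number of retained components, and the three-sigma threshold are all assumptions; the real ADMIRE additionally hashes traffic into three-step sketches, works on entropy time series, and adapts its parameters.

```python
# Minimal sketch of PCA-subspace anomaly scoring on synthetic data.
import numpy as np

rng = np.random.default_rng(1)
T, S = 200, 16                       # time bins x sketches
# "Normal" traffic: a trend shared across sketches plus small noise.
trend = np.sin(2 * np.pi * np.arange(T) / 50)
weights = rng.normal(1.0, 0.3, S)
history = np.outer(trend, weights) + rng.normal(0, 0.1, (T, S))

# Learn the normal subspace (top principal components) from clean history.
mean = history.mean(axis=0)
_, _, Vt = np.linalg.svd(history - mean, full_matrices=False)
k = 3                                # assumed number of normal components
P = Vt[:k].T @ Vt[:k]                # projector onto the normal subspace

# Score new time bins by their energy outside that subspace.
new = np.outer(np.sin(2 * np.pi * (np.arange(50) + T) / 50), weights)
new += rng.normal(0, 0.1, (50, S))
new[20, :4] += 2.0                   # implant an anomaly in four sketches
resid = (new - mean) - (new - mean) @ P
score = np.linalg.norm(resid, axis=1)
print("anomalous bins:", np.where(score > score.mean() + 3 * score.std())[0])
```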


Proceedings of the Special Workshop on Internet and Disasters | 2011

Disasters seen through Flickr cameras

Romain Fontugne; Kenjiro Cho; Youngjoon Won; Kensuke Fukuda

Collecting aftermath information after a wide-area disaster is a crucial task in disaster response that requires substantial human resources. We propose to assist reconnaissance teams by extracting useful data posted by social network users who experienced the disaster. In particular, we consider the photo-sharing website Flickr as a source of information for evaluating the disaster aftermath. We propose a methodology to detect major event occurrences from the behavior of Flickr users and to describe the nature of these events from the tags they post on the Flickr website. Our experiments on two case studies, the Tohoku earthquake and tsunami and the Tuscaloosa tornado, reveal the value of the data published by Flickr users and highlight the role of social networks in disaster response.
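
The detection intuition reduces to spotting surges in user activity against a baseline. The sketch below illustrates that statistic on invented upload records; it is not the paper's exact method.

```python
# Minimal sketch of event detection from user activity spikes.
from collections import defaultdict
import statistics

# Hypothetical upload records: (user, region, day)
uploads = [("u1", "tohoku", d) for d in range(10)] + \
          [(f"u{i}", "tohoku", 10) for i in range(50)]   # day-10 surge

daily_users = defaultdict(set)
for user, region, day in uploads:
    daily_users[(region, day)].add(user)

counts = [len(daily_users.get(("tohoku", d), set())) for d in range(11)]
baseline = statistics.mean(counts[:-1])
spread = statistics.pstdev(counts[:-1])
if counts[-1] > baseline + 3 * max(spread, 1):
    print("event detected on day 10; inspect that day's tags for context")
```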


ACM Symposium on Applied Computing | 2011

A Hough-transform-based anomaly detector with an adaptive time interval

Romain Fontugne; Kensuke Fukuda

Internet traffic anomalies are a serious problem that compromises the availability of network resources. Numerous anomaly detectors have recently been proposed, but keeping their parameters optimally tuned is a difficult task that undermines their effectiveness in daily use. This article proposes a new anomaly detection method based on pattern recognition and investigates the relationship between its parameter set and the traffic characteristics. This analysis highlights that consistently achieving a high detection rate requires continuous parameter adjustments in response to traffic fluctuations. Therefore, an adaptive time-interval mechanism is proposed to enhance the robustness of the detection method to traffic variations. This adaptive anomaly detection method is evaluated by comparing it with three other anomaly detectors on four years of real backbone traffic. The evaluation reveals that the proposed adaptive detection method outperforms the other methods in terms of true-positive and false-positive rates.
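
At the heart of the method is ordinary Hough line detection: traffic is rasterized into a two-dimensional picture in which anomalies trace lines, and voting in (rho, theta) space finds them. The sketch below demonstrates the Hough accumulator on a toy image; the traffic-to-image mapping and the adaptive time interval are omitted.

```python
# Minimal sketch of a Hough transform finding a line in a binary image.
import numpy as np

# Toy binary image: a diagonal line such as a port scan might trace.
img = np.zeros((50, 50), dtype=bool)
for i in range(50):
    img[i, i] = True

ys, xs = np.nonzero(img)
thetas = np.deg2rad(np.arange(0, 180))
diag = int(np.ceil(np.hypot(*img.shape)))
acc = np.zeros((2 * diag, len(thetas)), dtype=int)

# Vote in (rho, theta) space: each point lies on many candidate lines.
for x, y in zip(xs, ys):
    rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
    acc[rhos + diag, np.arange(len(thetas))] += 1

rho_i, theta_i = np.unravel_index(acc.argmax(), acc.shape)
print(f"strongest line: rho={rho_i - diag}, "
      f"theta={np.rad2deg(thetas[theta_i]):.0f} deg, votes={acc.max()}")
```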


International Conference on Wireless Communications and Mobile Computing | 2014

A taxonomy of anomalies in backbone network traffic

Johan Mazel; Romain Fontugne; Kensuke Fukuda

The potential threat of network anomalies on the Internet has led to a constant effort by the research community to design reliable detection methods. Detection is not enough, however, because network administrators need additional information on the nature of events occurring in a network. Several works try to classify detected events or to establish a taxonomy of known events, but these works do not overlap in their coverage of anomaly types. On the one hand, existing classification methods use a limited set of labels. On the other hand, taxonomies often target a single type of anomaly or, when they have a wider scope, fail to present the full spectrum of what really happens in the wild. We thus present a new taxonomy of network anomalies with wide coverage of existing work, along with a set of signatures that assign taxonomy labels to events. We present a preliminary study applying this taxonomy to six years of real network traffic from the MAWI repository. We classify previously documented anomalous events and draw two main conclusions. First, the taxonomy-based analysis provides new insights regarding events previously classified by heuristic rule labeling. For example, some RST events are now classified as network scan responses, and the majority of ICMP events are split into network scans and network scan responses. Moreover, some previously unknown events now account for a substantial number of all UDP network scans, network scan responses, and port scans. Second, the number of unknown events decreases from 20% to 10% of all events with the proposed taxonomy as compared to the heuristic approach.
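
Signature-based labeling of this kind can be expressed as an ordered list of predicates over event features. The sketch below is purely illustrative: the labels mirror terms used in the abstract, but the features and thresholds are invented, not the paper's signature set.

```python
# Minimal sketch of signature-based event labeling (illustrative rules).
signatures = [
    ("network scan (TCP SYN)", lambda e: e["proto"] == "tcp"
        and e["syn_ratio"] > 0.9 and e["n_dst_ips"] > 100),
    ("port scan",              lambda e: e["n_dst_ports"] > 100
        and e["n_dst_ips"] <= 5),
    ("network scan response",  lambda e: e["proto"] == "icmp"
        and e["icmp_type"] == 3),
]

def classify(event: dict) -> str:
    """Return the label of the first matching signature, else 'unknown'."""
    for label, predicate in signatures:
        if predicate(event):
            return label
    return "unknown"

event = {"proto": "tcp", "syn_ratio": 0.97, "n_dst_ips": 2500,
         "n_dst_ports": 1, "icmp_type": None}
print(classify(event))   # -> network scan (TCP SYN)
```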


IEEE/ACM Transactions on Networking | 2017

Scaling in Internet Traffic: A 14 Year and 3 Day Longitudinal Study, With Multiscale Analyses and Random Projections

Romain Fontugne; Patrice Abry; Kensuke Fukuda; Darryl Veitch; Kenjiro Cho; Pierre Borgnat; Herwig Wendt

In the mid-1990s, it was shown that the statistics of aggregated time series from Internet traffic departed from those of traditional short-range dependent models and were instead characterized by asymptotic self-similarity. Following this seminal contribution, many studies over the years have investigated the existence and form of scaling in Internet traffic. This contribution first presents a methodology, combining multiscale analysis (wavelets and wavelet leaders) and random projections (or sketches), that permits a precise, efficient, and robust characterization of scaling and is capable of seeing through non-stationary anomalies. Second, we apply the methodology to a data set spanning an unusually long period: 14 years from the MAWI traffic archive, allowing an in-depth longitudinal analysis of the form, nature, and evolution of scaling in Internet traffic, as well as of the network mechanisms producing it. We also study a separate three-day-long trace to obtain complementary insight into intra-day behavior. We find that a biscaling regime (two ranges of independent scaling phenomena) is systematically observed: long-range dependence over the large scales, and multifractal-like scaling over the fine scales. We quantify the actual scaling ranges precisely, verify to high accuracy the expected relationship between the long-range dependence parameter and the heavy-tail parameter of the flow size distribution, and relate fine-scale multifractal scaling to typical IP packet inter-arrival and round-trip time distributions.
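
The large-scale part of such an analysis rests on the logscale diagram: wavelet detail energies are computed scale by scale, and the slope of their log against scale estimates the long-range dependence exponent. The following worked example uses Haar wavelets on white noise, for which the estimated Hurst parameter should come out near 0.5; the paper's actual estimators (wavelet leaders, weighted regressions, sketches) are more elaborate.

```python
# Minimal worked example of a Haar-wavelet logscale diagram.
import numpy as np

def haar_log_diagram(x, n_scales=8):
    """Return log2 of mean squared Haar detail coefficients per scale."""
    logs = []
    approx = np.asarray(x, dtype=float)
    for _ in range(n_scales):
        approx = approx[: len(approx) // 2 * 2]   # even length
        even, odd = approx[::2], approx[1::2]
        detail = (even - odd) / np.sqrt(2)
        logs.append(np.log2(np.mean(detail ** 2)))
        approx = (even + odd) / np.sqrt(2)
    return np.array(logs)

rng = np.random.default_rng(2)
yj = haar_log_diagram(rng.normal(size=2 ** 16))
scales = np.arange(1, len(yj) + 1)
slope = np.polyfit(scales, yj, 1)[0]   # weighted fits are used in practice
print(f"estimated Hurst parameter H ~ {(slope + 1) / 2:.2f}")
```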


Internet Measurement Conference | 2017

Pinpointing delay and forwarding anomalies using large-scale traceroute measurements

Romain Fontugne; Cristel Pelsser; Emile Aben; Randy Bush

Understanding data-plane health is essential to improving Internet reliability and usability. For instance, detecting disruptions in distant networks can identify repairable connectivity problems. Currently this task is difficult and time consuming, as operators have poor visibility beyond their network's borders. In this paper we leverage the diversity of RIPE Atlas traceroute measurements to solve the classic problem of monitoring in-network delays and obtain credible delay-change estimates for monitoring network conditions in the wild. We demonstrate a set of complementary methods to detect network disruptions and report them in near real time. The first method detects delay changes for intermediate links in traceroutes. The second, a packet forwarding model, predicts traffic paths and identifies faulty routers and links in cases of packet loss. In addition, we define an alarm score that aggregates changes into a single value per AS in order to easily monitor its health, reducing the effect of uninteresting alarms. Using only existing public data, we monitor hundreds of thousands of link delays while adding no burden to the network. We present three cases demonstrating that the proposed methods detect real disruptions and provide valuable insights, as well as surprising findings, on the location and impact of the identified events.
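
The first method's building block is the differential RTT: for each traceroute crossing a link, the RTT difference between the two adjacent hops estimates the delay contributed by that link, and the median over many probes is tracked over time. The sketch below uses invented RTTs and a fixed tolerance where the paper derives confidence intervals from probe diversity.

```python
# Minimal sketch of differential-RTT link monitoring (simplified statistics).
import statistics

# Hypothetical per-probe RTTs (ms) at hop A and hop B for one link.
reference = [(10.1, 14.8), (9.9, 15.2), (10.3, 15.0), (10.0, 14.9)]
current   = [(10.2, 45.1), (10.0, 44.7), (9.8, 46.0), (10.1, 45.5)]

def median_diff(probes):
    """Median of per-probe RTT differences between the two adjacent hops."""
    return statistics.median(b - a for a, b in probes)

ref, cur = median_diff(reference), median_diff(current)
# A fixed tolerance stands in for the paper's confidence intervals.
if abs(cur - ref) > 3.0:
    print(f"delay change on link: {ref:.1f} ms -> {cur:.1f} ms")
```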


International Conference on Computer Communications | 2015

An empirical mixture model for large-scale RTT measurements

Romain Fontugne; Johan Mazel; Kensuke Fukuda

Monitoring delays in the Internet is essential to understand network conditions and ensure the proper functioning of time-sensitive applications. Large-scale measurements of round-trip time (RTT) are promising data sources for gaining better insight into Internet-wide delays. However, the lack of an efficient methodology for modeling RTTs prevents researchers from leveraging the value of these datasets. In this work, we propose a log-normal mixture model to identify, characterize, and monitor the spatial and temporal dynamics of RTTs. This data-driven approach provides a coarse-grained view of numerous RTTs in the form of a graph, thus enabling efficient and systematic analysis of Internet-wide measurements. Using this model, we analyze more than 13 years of RTTs from about 12 million unique IP addresses in passively measured backbone traffic traces. We evaluate the proposed method by comparison with external data sets and present examples where the model highlights interesting delay fluctuations due to route changes or congestion. We also introduce an application based on the model to identify hosts deviating from their typical RTT fluctuations, and we envision various applications for this empirical model.
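
Because the logarithm of a log-normal variable is Gaussian, fitting such a model can be reduced to fitting a Gaussian mixture on log(RTT). The sketch below does exactly that on synthetic data, assuming scikit-learn is available; the two-component setup and the delay modes are assumptions for illustration.

```python
# Minimal sketch of a log-normal mixture fit via a Gaussian mixture on logs.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)
# Two delay modes, e.g. a direct path and a congested or detoured path.
rtts = np.concatenate([rng.lognormal(np.log(20), 0.1, 800),
                       rng.lognormal(np.log(150), 0.2, 200)])

gmm = GaussianMixture(n_components=2, random_state=0)
gmm.fit(np.log(rtts).reshape(-1, 1))

# Each component summarizes one typical delay mode in milliseconds.
for w, mu in zip(gmm.weights_, gmm.means_.ravel()):
    print(f"mode ~ {np.exp(mu):6.1f} ms (weight {w:.2f})")
```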

Collaboration


Dive into Romain Fontugne's collaborations.

Top Co-Authors

Kensuke Fukuda
National Institute of Informatics

Johan Mazel
University of Toulouse

Patrice Abry
École normale supérieure de Lyon

Anant Shah
Colorado State University

Pierre Borgnat
Centre national de la recherche scientifique