

Publication


Featured research published by Taoufik En-Najjary.


Internet Measurement Conference | 2007

A global view of kad

Moritz Steiner; Taoufik En-Najjary; Ernst W. Biersack

Distributed hash tables (DHTs) have been actively studied in the literature, and many different proposals have been made on how to organize peers in a DHT. However, very few DHTs have been implemented in real systems and deployed on a large scale. One exception is KAD, a DHT based on Kademlia, which is part of eDonkey2000, a peer-to-peer file sharing system with several million simultaneous users. We have been crawling KAD continuously for about six months and obtained information about the total number of peers online and their geographical distribution. Peers are identified by the so-called KAD ID, which was up to now assumed to remain the same across sessions. However, we observed that this is not the case: a large number of peers, in particular in China, change their KAD ID, sometimes as frequently as after each session. This change of KAD IDs makes it difficult to characterize end-user availability or membership turnover.
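KAD, like Kademlia, routes lookups by the XOR distance between 160-bit identifiers; a crawler walks the ID space by repeatedly asking peers for their neighbors closest to a target. A minimal sketch of the metric (the `kad_id` hash here is purely illustrative; real KAD peers draw their IDs at random, and the names are invented):

```python
import hashlib

ID_BITS = 128  # illustrative width; KAD uses 128-bit IDs

def kad_id(seed: str) -> int:
    # Illustrative deterministic ID; real peers pick theirs at random.
    return int.from_bytes(hashlib.md5(seed.encode()).digest(), "big")

def xor_distance(a: int, b: int) -> int:
    # Kademlia (and hence KAD) measures closeness as the XOR of two IDs.
    return a ^ b

# A crawler asks each contacted peer for its neighbors closest to a target:
target = kad_id("some keyword")
peers = [kad_id(f"peer-{i}") for i in range(10)]
closest = min(peers, key=lambda p: xor_distance(p, target))
```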


IEEE/ACM Transactions on Networking | 2009

Long term study of peer behavior in the KAD DHT

Moritz Steiner; Taoufik En-Najjary; Ernst W. Biersack

Distributed hash tables (DHTs) have been actively studied in the literature, and many different proposals have been made on how to organize peers in a DHT. However, very few DHTs have been implemented in real systems and deployed on a large scale. One exception is KAD, a DHT based on Kademlia, which is part of eDonkey, a peer-to-peer file sharing system with several million simultaneous users. We have been crawling a representative subset of KAD every five minutes for six months and obtained information about the geographical distribution of peers, session times, daily usage, and peer lifetime. We have found that session times are Weibull distributed, and we show how this information can be exploited to make the publishing mechanism much more efficient. Peers are identified by the so-called KAD ID, which up to now was assumed to be persistent. However, we observed that a fraction of peers change their KAD ID as frequently as once per session. This change of KAD IDs makes it difficult to characterize end-user behavior. For this reason we have been crawling the entire KAD network once a day for more than a year to track end-users with static IP addresses, which allows us to estimate end-user lifetime and the fraction of end-users changing their KAD ID.
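The Weibull finding lends itself directly to scheduling: with fitted shape and scale parameters, the survival function gives the probability that a peer is still online after t seconds, which is exactly what a republishing schedule needs. A sketch under made-up parameters (the values below are not the paper's fit):

```python
import math

def weibull_survival(t: float, shape: float, scale: float) -> float:
    # P(session length > t) for a Weibull(shape, scale) distribution.
    return math.exp(-((t / scale) ** shape))

# Republish a key only once the probability that the responsible peer
# is still online drops below 50% (parameters invented for illustration):
shape, scale = 0.6, 3600.0
republish_after = next(t for t in range(0, 24 * 3600, 60)
                       if weibull_survival(t, shape, scale) < 0.5)
```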


ACM Special Interest Group on Data Communication | 2007

Exploiting KAD: possible uses and misuses

Moritz Steiner; Taoufik En-Najjary; Ernst W. Biersack

Peer-to-peer systems have seen tremendous growth in the last few years, and peer-to-peer traffic makes up a major fraction of the total traffic seen in the Internet. The dominating application for peer-to-peer is file sharing. Some of the most popular peer-to-peer systems for file sharing have been Napster, FastTrack, BitTorrent, and eDonkey, each counting a million or more users at its peak. We got interested in KAD since it is the only DHT that has been part of a very popular peer-to-peer system with several million simultaneous users. As we have been studying KAD over the course of the last 18 months, we have been both fascinated and frightened by the possibilities KAD offers. Mounting a Sybil attack is very easy in KAD and allows an attacker to compromise the privacy of KAD users, to compromise the correct operation of the key lookup, and to mount distributed denial-of-service (DDoS) attacks with very little resources. In this paper, we relate some of our findings and point out how KAD can be used and misused.
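The core of such a Sybil attack is cheap identifier forgery: because peers choose their own IDs, an attacker can mint arbitrarily many IDs sharing a long prefix with a target key, so the fake peers sort closest to it under the XOR metric and intercept lookups. A hedged sketch of the ID-forging step only (128-bit IDs; function names invented here):

```python
import os

ID_BITS = 128

def sybil_ids_near(target: int, prefix_bits: int, count: int) -> list[int]:
    # Forge IDs that copy the top `prefix_bits` of `target`; the rest
    # is random so the forged identities look distinct.
    mask = ((1 << prefix_bits) - 1) << (ID_BITS - prefix_bits)
    low = (1 << (ID_BITS - prefix_bits)) - 1
    return [(target & mask)
            | (int.from_bytes(os.urandom(ID_BITS // 8), "big") & low)
            for _ in range(count)]
```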


Internet Measurement Conference | 2009

Challenging statistical classification for operational usage: the ADSL case

Marcin Pietrzyk; Jean-Laurent Costeux; Guillaume Urvoy-Keller; Taoufik En-Najjary

Accurate identification of network traffic according to application type is a key issue for most companies, including ISPs. For example, some companies might want to ban p2p traffic from their network, while some ISPs might want to offer additional services based on the application. To classify applications on the fly, most companies rely on deep packet inspection (DPI) solutions. While DPI tools can be accurate, they require constant updates of their signature databases. Recently, several statistical traffic classification methods have been proposed. In this paper, we investigate the use of these methods for an ADSL provider managing many Points of Presence (PoPs). We demonstrate that statistical methods can offer performance similar to that of DPI tools when the classifier is trained for a specific site. They can also complement existing DPI techniques to mine traffic that the DPI solution failed to identify. However, we also demonstrate that, even if a statistical classifier is very accurate on one site, the resulting model cannot be applied directly to other locations. We show that this problem stems from the statistical classifier learning site-specific information.
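As an illustration of the statistical (as opposed to DPI) approach, here is a toy nearest-centroid classifier over two per-flow features; the feature values and labels are invented, and the classifiers and feature sets studied in the paper are more elaborate:

```python
# Toy flows as (mean packet size in bytes, mean inter-arrival time in ms).
TRAINING = {
    "p2p": [(900.0, 40.0), (1100.0, 55.0)],
    "web": [(400.0, 8.0), (300.0, 12.0)],
}

def centroid(points):
    # Component-wise mean of a list of feature vectors.
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

CENTROIDS = {app: centroid(pts) for app, pts in TRAINING.items()}

def classify(flow):
    # Assign the flow to the application whose centroid is nearest
    # in squared Euclidean distance over the feature space.
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(flow, c))
    return min(CENTROIDS, key=lambda app: sqdist(CENTROIDS[app]))
```

The paper's portability warning applies directly to this sketch: centroids learned at one PoP encode site-specific feature values and need not transfer to another site.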


Conference on Emerging Networking Experiments and Technologies | 2007

Proactive replication in distributed storage systems using machine availability estimation

Alessandro Duminuco; Ernst W. Biersack; Taoufik En-Najjary

Distributed storage systems provide data availability by means of redundancy. To assure a given level of availability in case of node failures, new redundant fragments need to be introduced. Since node failures can be either transient or permanent, deciding when to generate new fragments is non-trivial. An additional difficulty is that the failure behavior, in terms of the rate of permanent and transient failures, may vary over time. To be able to adapt to changes in the failure behavior, many systems adopt a reactive approach, in which new fragments are created as soon as a failure is detected. However, reactive approaches tend to produce spikes in bandwidth consumption. Proactive approaches create new fragments at a fixed rate that depends on knowledge of the failure behavior or is given by the system administrator. However, existing proactive systems are not able to adapt to a changing failure behavior, which is common in the real world. We propose a new technique based on an ongoing estimation of the failure behavior that is obtained using a model consisting of a network of queues. This scheme combines the adaptiveness of reactive systems with the smooth bandwidth usage of proactive systems, generalizing the two previous approaches. The reactive/proactive duality thus becomes a special case of a broader approach that can be tuned to the dynamics of the failure behavior.
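The redundancy trade-off behind such a system can be seen with the standard erasure-coding availability formula (textbook reasoning, not the paper's queueing model): with n fragments of which any k suffice to reconstruct the object, each fragment independently online with probability p:

```python
from math import comb

def object_availability(n: int, k: int, p: float) -> float:
    # P(at least k of n fragments are online), fragments independent.
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))

# New fragments must be generated before failures push the number of
# live fragments down toward k; how early depends on the failure rate,
# which is what the estimator in the paper tracks.
```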


Conference on Emerging Networking Experiments and Technologies | 2008

Capacity estimation of ADSL links

Daniele Croce; Taoufik En-Najjary; Guillaume Urvoy-Keller; Ernst W. Biersack

Most tools designed to estimate the capacity of an Internet path require access to both end hosts of the path, which makes them difficult to deploy and use. In this paper we present a single-sided technique for measuring the capacity without the active cooperation of the destination host, focusing particularly on ADSL links. Compared to current methods used on broadband hosts, our approach generates two orders of magnitude less traffic and is much less intrusive. Our tool, DSLprobe, exploits the typical characteristics of ADSL, namely its bandwidth asymmetry and the relatively low absolute bandwidth, in order to measure both downlink and uplink capacities and to mitigate the impact of cross-traffic. To further improve the accuracy, we study different ways to detect and filter cross-traffic packets, and we show how to recognize and overcome limited uplink capacities. We validate our tool both on controlled hosts and on a wide variety of Internet hosts. Finally, we present a case study of two large ADSL providers.
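The principle behind such capacity probing is the classic packet-pair dispersion formula (a generic estimate, not DSLprobe's full algorithm): two back-to-back packets leave the bottleneck spaced by the time the link needs to serialize one packet, so capacity is packet size divided by that spacing:

```python
def capacity_bps(packet_size_bytes: int, dispersion_s: float) -> float:
    # Bottleneck capacity in bit/s from the dispersion (spacing) two
    # back-to-back probe packets acquire at the narrow link.
    return packet_size_bytes * 8 / dispersion_s

# 1500-byte packets arriving 12 ms apart suggest a ~1 Mbit/s downlink,
# a plausible ADSL rate:
estimate = capacity_bps(1500, 0.012)
```

Cross-traffic packets queued between the probes inflate the dispersion, which is why filtering them (as the abstract describes) matters.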


ACM Special Interest Group on Data Communication | 2008

Operational comparison of available bandwidth estimation tools

Guillaume Urvoy-Keller; Taoufik En-Najjary; Alessandro Sorniotti

The available bandwidth of a path directly impacts the performance of throughput-sensitive applications, e.g., p2p content replication or podcasting. Several tools have been devised to estimate the available bandwidth. The vast majority of these tools follow either the Probe Rate Model (PRM) or the Probe Gap Model (PGM). Lao et al. [6] and Liu et al. [7] have identified biases in the PGM approach that lead to consistent underestimations of the available bandwidth. Those results were obtained under the ideal assumption of stationary cross traffic. In this note, we confirm the existence of these biases experimentally, i.e., for the case of non-stationary cross traffic. To do so, we compare one representative of the PRM family, namely Pathload, and one representative of the PGM family, namely Spruce, using long-term (several-day-long) traces collected on an example path. We first propose a methodology to compare operational results of two available bandwidth measurement tools. Based on the sanitized data obtained using this methodology, we then show that the biases identified by previous works are clearly observable over the long term, even with non-stationary cross traffic. We further uncover the formal link that exists between the work by Liu et al. and that by Lao et al.
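The Probe Gap Model at issue can be stated in a few lines (a Spruce-style simplification assuming a single tight link of known capacity): the growth of the gap between two probes is attributed to cross traffic on that link, which is then subtracted from the capacity:

```python
def pgm_available_bandwidth(capacity: float,
                            gap_in: float, gap_out: float) -> float:
    # Cross-traffic rate inferred from how much the inter-probe gap
    # grew while crossing the tight link (Spruce-style PGM).
    cross_rate = capacity * (gap_out - gap_in) / gap_in
    return max(capacity - cross_rate, 0.0)

# Unchanged gap -> no inferred cross traffic -> full capacity available:
idle = pgm_available_bandwidth(100e6, 1e-4, 1e-4)
# A gap that grows by 50% implies half the capacity is in use:
busy = pgm_available_bandwidth(100e6, 1e-4, 1.5e-4)
```

The biases discussed in the note arise precisely because these modeling assumptions (single tight link, fluid and stationary cross traffic) do not hold on real paths.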


Conference on Emerging Networking Experiments and Technologies | 2005

Root cause analysis for long-lived TCP connections

Matti Siekkinen; Guillaume Urvoy-Keller; Ernst W. Biersack; Taoufik En-Najjary

While the applications using the Internet have changed over time, TCP is still the dominating transport protocol, carrying over 90% of the total traffic. Throughput is the key performance metric for long TCP connections. The achieved throughput results from the aggregate effects of the network path, the parameters of the TCP end points, and the application on top of TCP. Finding out which of these factors is limiting the throughput of a TCP connection -- referred to as TCP root cause analysis -- is important for end users who want to understand the origins of their problems, ISPs that need to troubleshoot their network, and application designers who need to know how to interpret the performance of the application. In this paper, we revisit TCP root cause analysis by first demonstrating the weaknesses of a previously proposed flight-based approach. We next discuss in detail the different possible limitations and highlight the need to account for the application behavior during the analysis process. The main contribution of this paper is a new approach based on the analysis of time series extracted from packet traces. These time series allow for a quantitative assessment of the different causes with respect to the resulting throughput. We demonstrate the usefulness of our approach on a large BitTorrent dataset.
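The first step of any such time-series approach is turning a raw packet trace into a per-window throughput series; a minimal sketch (the trace format and window size are assumptions, and the paper's analysis goes well beyond this extraction step):

```python
def throughput_series(packets, window_s=1.0):
    # packets: (timestamp_s, payload_bytes) tuples from a packet trace.
    # Returns bytes/s per time window -- the kind of series from which
    # throughput-limiting causes can then be assessed.
    if not packets:
        return []
    start, end = packets[0][0], packets[-1][0]
    bins = [0] * (int((end - start) // window_s) + 1)
    for t, size in packets:
        bins[int((t - start) // window_s)] += size
    return [b / window_s for b in bins]
```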


International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment | 2008

The Quest for Multi-headed Worms

Van-Hau Pham; Marc Dacier; Guillaume Urvoy-Keller; Taoufik En-Najjary

In [6], Pouget et al. conjectured the existence of so-called multi-headed worms and found a couple of them in attack traces collected on a single honeypot. These worms take advantage of several distinct attack techniques to propagate, but they use only one of them against a given target. From a victim's viewpoint, they are therefore indistinguishable from the other classical worms that always propagate using the same attack vector or the same sequence of attack vectors. This paper aims at confirming the existence of these worms by studying a very large dataset. The validation process led to three important contributions. First, we establish the existence and assess the importance of three distinct classes of attacks seen in the wild. Second, we propose a new method to correlate attack trace time series and apply it to search for multi-headed worms. Third, we offer and discuss results of the analysis of 15 months of data gathered over 28 different platforms located all over the world.
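The correlation idea can be illustrated with plain Pearson correlation over per-day attack counts for two attack vectors (the paper's method is more elaborate than this, and the counts below are invented): two heads of the same worm should produce strongly correlated series.

```python
def pearson(x, y):
    # Linear correlation between two equally long attack-count series.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Invented daily counts for two attack vectors firing on the same days:
port_445 = [0, 3, 12, 9, 1, 0, 7]
port_139 = [1, 4, 11, 10, 0, 1, 6]
```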


International Conference on Computer Communications | 2008

Non-cooperative available bandwidth estimation towards ADSL links

Daniele Croce; Taoufik En-Najjary; Guillaume Urvoy-Keller; Ernst W. Biersack

Existing tools for the estimation of the end-to-end available bandwidth require control of both end hosts of the path, and this significantly limits their usability. In this paper we present ABwProbe, a single-ended tool for available bandwidth estimation against non-cooperative hosts. Although ABwProbe is general enough to be used on any Internet path, we focus our attention on ADSL links, exploring the possibility of measuring the downlink available bandwidth of a non-cooperative ADSL host. We study the effect of cross-traffic on the uplink, finding that only large packets may deteriorate ABwProbe's measurements, and we present two techniques to detect and filter the effect of uplink cross-traffic.

Collaboration


Dive into Taoufik En-Najjary's collaborations.

Top Co-Authors


Guillaume Urvoy-Keller

Centre national de la recherche scientifique


Lucile Sassatelli

Centre national de la recherche scientifique
