Publication


Featured research published by Sven-Arne Reinemo.


International Parallel and Distributed Processing Symposium | 2006

Segment-based routing: an efficient fault-tolerant routing algorithm for meshes and tori

Andres Mejia; Jose Flich; José Duato; Sven-Arne Reinemo; Tor Skeie

Computers get faster every year, but the demand for computing resources seems to grow at an even faster rate. Depending on the problem domain, this demand for more power can be satisfied by either massively parallel computers or clusters of computers. Common to both approaches is the dependence on high-performance interconnection networks such as Myrinet, InfiniBand, or 10 Gigabit Ethernet. While high throughput and low latency are key features of interconnection networks, the issue of fault tolerance is now becoming increasingly important. As the number of network components grows, so does the probability of failure; thus it becomes important to also consider the fault-tolerance mechanisms of interconnection networks. The main challenge then lies in combining performance and fault tolerance while still keeping cost and complexity low. This paper proposes a new deterministic routing methodology for tori and meshes, which achieves high performance without the use of virtual channels. Furthermore, it is topology-agnostic in nature, meaning it can handle any topology derived from any combination of faults when combined with static reconfiguration. The algorithm, referred to as segment-based routing (SR), works by partitioning a topology into subnets, and subnets into segments. This allows us to place bidirectional turn restrictions locally within a segment. As segments are independent, we gain the freedom to place turn restrictions within a segment independently of other segments. This results in a larger degree of freedom when placing turn restrictions compared to other routing strategies. In this paper a way to compute segment-based routing tables is presented and applied to meshes and tori. Evaluation results show that SR increases performance by a factor of 1.8 over FX and up*/down* routing.


IEEE Transactions on Parallel and Distributed Systems | 2006

Layered routing in irregular networks

Olav Lysne; Tor Skeie; Sven-Arne Reinemo; Ingebjørg Theiss

Freedom from deadlock is a key issue in cut-through, wormhole, and store-and-forward networks, and such freedom is usually obtained through careful design of the routing algorithm. Most existing deadlock-free routing methods for irregular topologies do, however, impose severe limitations on the available routing paths. We present a method called layered routing, which gives rise to a series of routing algorithms, some of which perform considerably better than previous ones. Our method groups virtual channels into network layers, and to each layer it assigns a limited set of source/destination address pairs. This separation of traffic yields a significant increase in routing efficiency. We show how the method can be used to improve the performance of irregular networks, both through load balancing and by guaranteeing shortest-path routing. The method is simple to implement, and its application does not require any features in the switches other than the existence of a modest number of virtual channels. The performance of the approach is evaluated through extensive experiments within three classes of technologies. These experiments reveal a need for virtual channels as well as an improvement in throughput for each technology class.
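
The idea of assigning source/destination pairs to layers can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a simple greedy rule (place each path in the first layer whose channel dependency graph stays acyclic), and the names `assign_layers`, `path_dependencies`, and `has_cycle` are hypothetical.

```python
def has_cycle(edges):
    """Return True if the directed graph {node: set(successors)} contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def visit(node):
        colour[node] = GREY
        for nxt in edges.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                return True  # back edge: a cycle exists
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n) for n in list(edges))


def path_dependencies(path):
    """Channel dependencies of a path: each link (u, v) waits on the next link (v, w)."""
    channels = list(zip(path, path[1:]))
    return list(zip(channels, channels[1:]))


def assign_layers(paths, num_layers):
    """Greedily map each path to the first layer that remains deadlock-free.

    Assumption: a layer is deadlock-free while its channel dependency
    graph is acyclic. Returns {(source, destination): layer_index}.
    """
    layers = [{} for _ in range(num_layers)]
    assignment = {}
    for path in paths:
        deps = path_dependencies(path)
        for index, graph in enumerate(layers):
            trial = {channel: set(succ) for channel, succ in graph.items()}
            for first, second in deps:
                trial.setdefault(first, set()).add(second)
            if not has_cycle(trial):
                layers[index] = trial
                assignment[(path[0], path[-1])] = index
                break
        else:
            raise ValueError(f"no deadlock-free layer left for path {path}")
    return assignment


# Three two-hop paths around a unidirectional 3-node ring: together they
# would close a dependency cycle, so the third path moves to a second layer.
ring_paths = [[0, 1, 2], [1, 2, 0], [2, 0, 1]]
print(assign_layers(ring_paths, num_layers=2))
```

With a single layer the same input raises `ValueError`, which mirrors the abstract's point that a modest number of virtual channels is what makes the additional routing freedom possible.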


International Parallel and Distributed Processing Symposium | 2010

First experiences with congestion control in InfiniBand hardware

Ernst Gunnar Gran; Magne Eimot; Sven-Arne Reinemo; Tor Skeie; Olav Lysne; Lars Paul Huse; Gilad Shainer

In lossless interconnection networks, congestion control (CC) can be an effective mechanism to achieve high performance and good utilization of network resources. Without CC, congestion in one node may grow into a congestion tree that can degrade performance severely. This degradation not only affects contributors to the congestion, but also throttles innocent traffic flows in the network. The InfiniBand standard describes CC functionality for detecting and resolving congestion. The InfiniBand CC concept is rich in that it specifies a set of parameters that can be tuned in order to achieve effective CC. There is, however, limited experience with the InfiniBand CC mechanism. To the best of our knowledge, only a few simulation studies exist. Recently, InfiniBand CC has been implemented in hardware, and in this paper we present the first experiences with such equipment. We show that the implemented InfiniBand CC mechanism effectively resolves congestion and improves fairness by solving the parking lot problem, if the CC parameters are appropriately set. By conducting extensive testing on a selection of the CC parameters, we have explored the parameter space and found a subset of parameter values that leads to efficient CC for our test scenarios. Furthermore, we show that InfiniBand CC increases the performance of the well-known HPC Challenge benchmark in a congested network.


International Parallel and Distributed Processing Symposium | 2011

vFtree - A Fat-Tree Routing Algorithm Using Virtual Lanes to Alleviate Congestion

Wei Lin Guay; Bartosz Bogdanski; Sven-Arne Reinemo; Olav Lysne; Tor Skeie

It is a well-known fact that multiple virtual lanes can improve performance in interconnection networks, but this knowledge has had little impact on real clusters. Currently, a large number of clusters using InfiniBand are based on fat-tree topologies that can be routed deadlock-free using only one virtual lane. Consequently, all the remaining virtual lanes are left unused. In this paper we suggest an enhancement to the fat-tree algorithm that utilizes virtual lanes to improve performance when hot-spots are present. Even though the bisection bandwidth in a fat-tree is constant, hot-spots are still possible, and they will degrade performance for flows not contributing to them due to head-of-line blocking. Such a situation may be alleviated through adaptive routing or congestion control; however, these methods are not yet readily available in InfiniBand technology. To remedy this problem, we have implemented an enhanced fat-tree algorithm in OpenSM that distributes traffic across all available virtual lanes without any configuration needed. We evaluated the performance of the algorithm on a small cluster and did a large-scale evaluation through simulations. In a congested environment, results show that we are able to achieve throughput increases of up to 38% on a small cluster and from 221% to 757%, depending on the hot-spot scenario, for a 648-port simulated cluster.


IEEE Communications Magazine | 2006

An overview of QoS capabilities in InfiniBand, Advanced Switching Interconnect, and Ethernet

Sven-Arne Reinemo; Tor Skeie; Thomas Sødring; Olav Lysne; O. Trudbakken

A recent trend in interconnection network technologies is the inclusion of various mechanisms to support a variety of quality of service (QoS) concepts. This has been necessitated by an increasing number of application areas that require some level of performance guarantees from the network for parts of its traffic. In this article we describe and compare the QoS capabilities and support of three of the most important interconnection network technology standards of today. Similarities between the technologies are explained and differences are clarified.


Journal of Parallel and Distributed Computing | 2014

A new proposal to deal with congestion in InfiniBand-based fat-trees

Jesús Escudero-Sahuquillo; Pedro Javier García; Francisco J. Quiles; Sven-Arne Reinemo; Tor Skeie; Olav Lysne; José Duato

The overall performance of High-Performance Computing applications may depend largely on the performance achieved by the network interconnecting the end-nodes; thus high-speed interconnect technologies like InfiniBand are used to provide high throughput and low latency. Nevertheless, network performance may be degraded due to congestion; thus using techniques to deal with the problems derived from congestion has become practically mandatory. In this paper we propose a straightforward congestion-management method suitable for fat-tree topologies built from InfiniBand components. Our proposal is based on a traffic-flow-to-service-level mapping that prevents, as much as possible with the resources available in current InfiniBand components (basically Virtual Lanes), the negative impact of the two most common problems derived from congestion: head-of-line blocking and buffer-hogging. We also provide a mathematical approach to analyze the efficiency of our proposal and several others by means of a set of analytical metrics. In certain traffic scenarios, we observe up to 68% of the ideal performance gain that could be achieved in HoL-blocking and buffer-hogging prevention.

Cost-efficient network-interconnect design is a critical task for HPC systems. Congestion degrades network performance: congestion management (CM) is required. InfiniBand (IB)-based interconnection networks have a strong presence in HPC systems. Flow2SL is a new CM technique for IB fat-trees, based on mapping traffic flows to SLs. Flow2SL achieves up to 68% of the ideal performance gain.


IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing | 2011

On the Relation between Congestion Control, Switch Arbitration and Fairness

Ernst Gunnar Gran; Eitan Zahavi; Sven-Arne Reinemo; Tor Skeie; Gilad Shainer; Olav Lysne

In lossless interconnection networks such as InfiniBand, congestion control (CC) can be an effective mechanism to achieve high performance and good utilization of network resources. The InfiniBand standard describes CC functionality for detecting and resolving congestion, but the design decisions on how to implement this functionality are left to the hardware designer. One must be cautious when making these design decisions not to introduce fairness problems, as our study shows. In this paper we study the relationship between congestion control, switch arbitration, and fairness. Specifically, we look at fairness among different traffic flows arriving at a hot-spot switch on different input ports, as CC is turned on. In addition, we study the fairness among traffic flows at a switch where some flows are exclusive users of their input ports while other flows are sharing an input port (the parking lot problem). Our results show that the implementation of congestion control in a switch is vulnerable to unfairness if care is not taken. In detail, we found that a threshold hysteresis of more than one MTU is needed to resolve arbitration unfairness. Furthermore, to fully solve the parking lot problem, proper configuration of the CC parameters is required.
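
The parking lot problem mentioned above can be made concrete with a small worked example. This is an idealised sketch, not the paper's experimental setup: it assumes all flows are saturated, the output port arbitrates round-robin between input ports, and flows sharing an input port split that port's share equally. The function name `port_fair_shares` is hypothetical.

```python
from fractions import Fraction


def port_fair_shares(flows_per_input_port):
    """Share of one congested output link that each flow receives.

    Assumption: the arbiter is fair per input port, not per flow, so
    every active input port gets an equal slice of the output link and
    the flows on that port divide the slice between them.
    """
    active_ports = [flows for flows in flows_per_input_port if flows]
    shares = {}
    for flows in active_ports:
        port_share = Fraction(1, len(active_ports))
        for flow in flows:
            shares[flow] = port_share / len(flows)
    return shares


# Flows A and B each own an input port; C, D and E share a third port.
shares = port_fair_shares([["A"], ["B"], ["C", "D", "E"]])
for flow, share in sorted(shares.items()):
    print(flow, share)
# A and B each get 1/3 of the link, while C, D and E get 1/9 each;
# a per-flow fair allocation would give every flow 1/5.
```

Under these assumptions the flows that share an input port receive a third of the bandwidth of the exclusive flows, which is the kind of imbalance a well-configured CC mechanism is meant to remove.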


Network-Aware Data Management | 2011

dFtree: a fat-tree routing algorithm using dynamic allocation of virtual lanes to alleviate congestion in InfiniBand networks

Wei Lin Guay; Sven-Arne Reinemo; Olav Lysne; Tor Skeie

End-point hot-spots can cause major slowdowns in interconnection networks due to head-of-line blocking and congestion. Therefore, avoiding congestion is important to ensure high performance for the network traffic. It is especially important in situations where permanent congestion, which results in permanent slowdown, can occur. Permanent congestion occurs when traffic has been moved away from a failed link, when multiple jobs run on the same system and compete for network resources, or when a system is not balanced for the application that runs on it. In this paper we suggest a mechanism for dynamic allocation of virtual lanes and live optimization of the distribution of flows between the allocated virtual lanes. The purpose is to alleviate the negative effect of permanent congestion by separating network flows into slow-lane and fast-lane traffic. Flows destined for an end-point hot-spot are placed in the slow lane and all other flows are placed in the fast lane. Consequently, the flows in the fast lane are unaffected by the head-of-line blocking created by the hot-spot traffic. We demonstrate the feasibility of this approach using a modified version of OFED and OpenSM with fat-tree routing on a small InfiniBand cluster. Our experiments show an increase in throughput ranging from 150% to 468% compared to the conventional fat-tree algorithm in OFED.
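
The slow-lane/fast-lane separation described above can be sketched as a simple classification step. This is only an illustration under stated assumptions: the set of hot-spot destinations is taken as given (the abstract does not say how hot-spots are detected), and the lane numbers and the name `assign_lanes` are hypothetical rather than taken from OFED or OpenSM.

```python
FAST_LANE = 0  # hypothetical virtual lane for flows that avoid hot-spots
SLOW_LANE = 1  # hypothetical virtual lane for flows heading to a hot-spot


def assign_lanes(flows, hot_spots):
    """Map each (source, destination) flow to a virtual lane.

    Assumption: `hot_spots` is the current set of congested end-points,
    supplied by some external monitoring step not modelled here.
    """
    return {
        (src, dst): SLOW_LANE if dst in hot_spots else FAST_LANE
        for src, dst in flows
    }


flows = [("n1", "n4"), ("n2", "n4"), ("n3", "n5")]

# Initially n4 is the hot-spot, so both flows towards it share the slow lane.
print(assign_lanes(flows, hot_spots={"n4"}))

# If the hot-spot later moves to n5, re-running the mapping reassigns lanes.
print(assign_lanes(flows, hot_spots={"n5"}))
```

Re-running the mapping whenever the hot-spot set changes stands in for the live optimization the abstract mentions; the real mechanism also has to reconfigure the fabric, which this sketch leaves out.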


International Parallel and Distributed Processing Symposium | 2012

Exploring the Scope of the InfiniBand Congestion Control Mechanism

Ernst Gunnar Gran; Sven-Arne Reinemo; Olav Lysne; Tor Skeie; Eitan Zahavi; Gilad Shainer

In a lossless interconnection network, network congestion needs to be detected and resolved to ensure high performance and good utilization of network resources at high network load. If no countermeasure is taken, congestion at a node in the network will stimulate the growth of a congestion tree that affects not only contributors to congestion, but also other traffic flows in the network. Left untouched, the congestion tree will block traffic flows, lead to underutilization of network resources, and result in a severe drop in network performance. The InfiniBand standard specifies a congestion control (CC) mechanism to detect and resolve congestion before a congestion tree is able to grow and thereby hamper network performance. The InfiniBand CC mechanism includes a rich set of parameters that can be tuned in order to achieve effective CC. Even though it has been shown that the CC mechanism, properly tuned, is able to improve both throughput and fairness in an interconnection network, it has been questioned whether the mechanism is fast enough to keep up with dynamic network traffic, and whether a given set of parameter values for a topology is robust across different traffic patterns or the parameters need to be tuned depending on the applications in use. In this paper we address both these questions. Using the three-stage fat-tree topology from the Sun Datacenter InfiniBand Switch 648 as a basis, and a simulator tuned against CC-capable InfiniBand hardware, we conduct a systematic study of the efficiency of the InfiniBand CC mechanism as the network traffic becomes increasingly dynamic. Our studies show that InfiniBand CC, even when using a single set of parameter values, performs very well as the traffic patterns become increasingly dynamic, outperforming a network without CC in all cases. Our results show throughput increases ranging from a few percent to seventeen-fold.


International Symposium on Microarchitecture | 2010

Ethernet for High-Performance Data Centers: On the New IEEE Datacenter Bridging Standards

Sven-Arne Reinemo; Tor Skeie; Manoj K. Wadekar

Through the Datacenter Bridging task group, IEEE will add four supplements to the 802.1 standard that will both close the performance gap between Ethernet and InfiniBand and make the converged network a reality. In a converged network, all applications use a single physical infrastructure, which is ideal for the emerging next generation of data centers.

Collaboration

Top Co-Authors

Tor Skeie
Simula Research Laboratory

Olav Lysne
Simula Research Laboratory

Ernst Gunnar Gran
Simula Research Laboratory

Thomas Sødring
Simula Research Laboratory