Publication


Featured research published by Frank Olaf Sem-Jacobsen.


International Symposium on System-on-Chip | 2009

Flexible DOR routing for virtualization of multicore chips

Tor Skeie; Frank Olaf Sem-Jacobsen; Samuel Rodrigo; Jose Flich; Davide Bertozzi; Simone Medardoni

The expected increase in the number of cores on a single chip makes high-performance on-chip interconnects (networks-on-chip, NoCs) necessary. Furthermore, in order to fully utilize the abundance of cores, the chip is expected to support a number of applications running simultaneously. It is therefore necessary to partition the chip to support numerous applications without any risk of interference between them. The success of this depends on the flexibility of the underlying routing algorithm. This paper presents a flexible routing algorithm based on dimension-ordered routing, which supports a large variety of irregular (2-D and 3-D) mesh topologies. The algorithm provides high efficiency at very low additional complexity, as confirmed by experimental results.
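For readers unfamiliar with the baseline this abstract builds on, the following is a minimal sketch of plain dimension-ordered (XY) routing on a regular 2-D mesh. It is not the flexible algorithm from the paper, which handles irregular topologies; the function name `dor_route` and the coordinate representation are assumptions made for illustration only.

```python
def dor_route(src, dst):
    """Return the hop-by-hop path from src to dst on a regular 2-D mesh.

    Plain XY dimension-ordered routing: the packet first travels along
    the X dimension until the X coordinates match, then along Y.
    Illustrative baseline only; not the paper's flexible variant.
    """
    x, y = src
    dst_x, dst_y = dst
    path = [(x, y)]

    # Resolve the X dimension first.
    while x != dst_x:
        x += 1 if dst_x > x else -1
        path.append((x, y))

    # Then resolve the Y dimension.
    while y != dst_y:
        y += 1 if dst_y > y else -1
        path.append((x, y))

    return path


if __name__ == "__main__":
    print(dor_route((0, 0), (2, 1)))
```

Because every packet exhausts one dimension before turning into the next, the path between two nodes is fixed, which is what makes the scheme simple but also inflexible on meshes with missing nodes or links.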


International Parallel and Distributed Processing Symposium | 2005

Siamese-twin: a dynamically fault-tolerant fat-tree

Frank Olaf Sem-Jacobsen; Tor Skeie; Olav Lysne; O. Toerudbakken; E. Rongved; Bjørn Dag Johnsen

Fat-trees are a special case of multistage interconnection networks with quite good static fault tolerance capabilities. However, they are unable to provide local dynamic fault tolerance in a straightforward way. In this paper we propose a network topology based on the fat-tree that uses two parallel networks with crossover links between them in an effort to enable dynamic fault tolerance. We evaluate and compare this topology with two other similar fat-tree topologies and show through simulations that the new topology slightly improves the ability to tolerate faults statically. More importantly, we show that the new network topology is the only one of the evaluated topologies able to tolerate one fault dynamically, with superior network performance in the face of dynamically handled faults.


Cluster Computing and the Grid | 2012

Topology Agnostic Dynamic Quick Reconfiguration for Large-Scale Interconnection Networks

Frank Olaf Sem-Jacobsen; Olav Lysne

Tolerating faults in interconnection networks is of vital importance in today's huge computer installations. Still, the existing solutions fall short of being satisfactory. They require that the system default to a routing algorithm that is inferior to the original, either in terms of performance, or in terms of the need for virtual channels, or both. Furthermore, since dynamic reconfiguration is not supported in current hardware, existing methods require the system to be halted while reconfiguration takes place in order to avoid deadlocks. In this paper we present a method that efficiently generates a new routing function in the presence of faults. The new routing function only reroutes the traffic that is affected by the fault, so that the performance of the original routing function is preserved to the extent possible. No specific functionality in the switches is required; we require exactly the same number of virtual channels in the presence of faults as the original routing algorithm did. Finally, the new routing function is compatible with the old one, so that a deadlock-free dynamic transition between the old and the new routing function is immediately available. This means that our solution can easily be implemented on current InfiniBand platforms, e.g. through the OFED software stack. We demonstrate that the method is workable for meshes, tori and fat-trees, and that it is able to guarantee one-fault tolerance for all of these topologies.


IEEE Transactions on Computers | 2011

Dynamic Fault Tolerance in Fat Trees

Frank Olaf Sem-Jacobsen; Tor Skeie; Olav Lysne; José Duato

Fat trees are a very common communication architecture in current large-scale parallel computers. The probability of failure in these systems increases with the number of components. We present a routing method for deterministically and adaptively routed fat trees, applicable to both distributed and source routing, that is able to handle several concurrent faults and that transparently returns to the original routing strategy once the faulty components have recovered. The method is local and dynamic, completely masking the fault from the rest of the system. It requires only a small amount of extra functionality in the switches to handle rerouting packets around a fault. The method guarantees connectedness and deadlock and livelock freedom for up to k - 1 benign simultaneous switch and/or link faults, where k is half the number of ports in the switches. Our simulation experiments show a graceful degradation of performance as more faults occur. Furthermore, we demonstrate that for most fault combinations, our method is able to handle significantly more faults than the k - 1 limit with high probability.
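To make the stated bound concrete, here is a small sketch that computes the guaranteed number of tolerated faults, k - 1, from the switch port count, with k taken as half the number of ports as in the abstract. The helper name `guaranteed_fault_tolerance` and the requirement of an even port count are assumptions made for illustration; the sketch only restates the arithmetic and does not implement the routing method.

```python
def guaranteed_fault_tolerance(num_switch_ports):
    """Return k - 1, where k is half the number of switch ports.

    This mirrors the bound quoted in the abstract. The even-port-count
    check is an assumption of this sketch, not a claim from the paper.
    """
    if num_switch_ports < 2 or num_switch_ports % 2 != 0:
        raise ValueError("expected an even port count of at least 2")
    k = num_switch_ports // 2
    return k - 1


if __name__ == "__main__":
    for ports in (4, 8, 36):
        print(ports, "ports ->", guaranteed_fault_tolerance(ports), "faults")
```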


Proceedings of the Fifth International Workshop on Interconnection Network Architecture | 2011

iFDOR: dynamic rerouting on-chip

Frank Olaf Sem-Jacobsen; Samuel Rodrigo; Tor Skeie

Many-core chip design requires flexible routing solutions for the interconnect to handle faults, provide performance partitions, and react to dynamic changes in processing requirements and power/heat distribution. We have developed a logic-based rerouting mechanism suitable for tolerating dynamic powering down of regions within the application partition on the chip. This mechanism is combined with the logic-based FDOR routing algorithm to create a powerful routing algorithm with low implementation cost. This allows for higher system utilisation by enabling more efficient power management, and supports many irregular mesh topologies through flexible virtualisation. Results show that powering down a single switch results in an 8% throughput reduction in the worst case for the evaluated topology.


International Conference on Parallel Processing | 2006

Dynamic Fault Tolerance with Misrouting in Fat Trees

Frank Olaf Sem-Jacobsen; Tor Skeie; Olav Lysne; José Duato

Fault tolerance is critical for efficient utilisation of large computer systems. Dynamic fault tolerance allows the network to remain available through the occurrence of faults, as opposed to static fault tolerance, which requires the network to be halted to reconfigure it. Although dynamic fault tolerance may lead to less efficient solutions than static fault tolerance, it allows for a much higher availability of the system. In this paper we devise a dynamic fault-tolerant adaptive routing algorithm for the fat tree, a much used interconnect topology, which relies on misrouting around link faults. We show that we are guaranteed to tolerate any combination of fewer than (num_switch_ports)/2 link faults without the need for additional network resources for deadlock freedom. There is also a high probability of tolerating an even larger number of link faults. Simulation results show that network performance degrades very little when faults are dynamically tolerated.


International Conference on Parallel and Distributed Systems | 2010

Achieving Predictable High Performance in Imbalanced Fat Trees

Bartosz Bogdanski; Frank Olaf Sem-Jacobsen; Sven-Arne Reinemo; Tor Skeie; Line Holen; Lars Paul Huse

The fat-tree topology has become a popular choice for InfiniBand fabrics due to its inherent deadlock freedom, fault-tolerance and full bisection bandwidth. InfiniBand is used by more than 40% of the systems on the latest Top 500 list, and many of these systems are based on a fat-tree topology. However, the current InfiniBand fat-tree routing algorithm suffers from flaws that reduce its scalability and flexibility. Counter-intuitively, the achievable throughput per node deteriorates both when the number of nodes in a tree decreases and when the node distribution among leaves is nonuniform. In this paper, we identify the weaknesses of the current enhanced fat-tree routing algorithm in Open Fabrics Enterprise Distribution and we propose extensions to it that alleviate all performance problems related to node distribution. The new algorithm is implemented in OpenSM for real world evaluation and for future contribution to the Open Fabrics community. We demonstrate that our solution achieves a predictable high throughput regardless of the number of nodes and their distribution. Furthermore, the simulations show that our extensions improve throughput by up to 30% depending on topology size and node distribution.


High Performance Embedded Architectures and Compilers | 2012

sFtree: A fully connected and deadlock-free switch-to-switch routing algorithm for fat-trees

Bartosz Bogdanski; Sven-Arne Reinemo; Frank Olaf Sem-Jacobsen; Ernst Gunnar Gran

Existing fat-tree routing algorithms fully exploit the path diversity of a fat-tree topology in the context of compute node traffic, but they lack support for deadlock-free and fully connected switch-to-switch communication. Such support is crucial for efficient system management, for example, in InfiniBand (IB) systems. With the general increase in system management capabilities found in modern InfiniBand switches, the lack of deadlock-free switch-to-switch communication is a problem for fat-tree-based IB installations because management traffic might cause routing deadlocks that bring the whole system down. This lack of deadlock-free communication affects all system management and diagnostic tools using LID routing. In this paper, we propose the sFtree routing algorithm that guarantees deadlock-free and fully connected switch-to-switch communication in fat-trees while maintaining the properties of the current fat-tree algorithm. We prove that the algorithm is deadlock free and we implement it in OpenSM for evaluation. We evaluate the performance of the sFtree algorithm experimentally on a small cluster and we do a large-scale evaluation through simulations. The results confirm that the sFtree routing algorithm is deadlock-free and show that the impact of switch-to-switch management traffic on the end-node traffic is negligible.


Symposium on Computer Architecture and High Performance Computing | 2006

Combining Source Routing and Dynamic Fault Tolerance

Frank Olaf Sem-Jacobsen; Olav Lysne; Tor Skeie

An increasing number of interconnect technologies rely on source routing to forward packets through the network. It is therefore important to develop methods for fault tolerance that are well suited for source routed networks. Dynamic fault tolerance allows the network to remain available through the occurrence of faults, as opposed to static fault tolerance, which requires the network to be halted to reconfigure it. Source routing readily supports the source node choosing a different path when a fault occurs, but using this approach, packets already in the network will be lost. Local dynamic fault tolerance, where the packet is routed around the fault locally, would prevent much of the traffic being lost during failures, but this is cumbersome to achieve in source routed networks since packets encountering a fault will need to follow a path different from that encoded in the packet header. In this paper we present a mechanism to achieve local dynamic fault tolerance in source routed fat trees, a topology that has widespread use in supercomputer systems, and compare it with endpoint dynamic fault tolerance. We also show that by combining the two approaches we achieve performance superior to either of the two individually.


International Parallel and Distributed Processing Symposium | 2008

Fault tolerance with shortest paths in regular and irregular networks

Frank Olaf Sem-Jacobsen; Olav Lysne

Fault tolerance has become an important part of current supercomputers. Local dynamic fault tolerance is the most expedient way of tolerating faults, and works by preconfiguring the network with multiple paths from every node/switch to every destination. In this paper we present a local shortest path dynamic fault-tolerance mechanism inspired by a solution developed for the Internet, which can be applied to any shortest path routing algorithm such as dimension ordered routing, fat tree routing, layered shortest path, etc., and provide a solution for achieving deadlock freedom in the presence of faults. Simulation results show that 1) for fat trees this yields the highest throughput and lowest requirements on virtual layers to date for dynamic one-fault tolerance, 2) we require in general few layers to achieve deadlock freedom, and 3) for irregular topologies it gives up to a tenfold performance increase compared to FRoots.

Collaboration


Dive into Frank Olaf Sem-Jacobsen's collaborations.

Top Co-Authors

Tor Skeie
Simula Research Laboratory

Olav Lysne
Simula Research Laboratory

Samuel Rodrigo
Simula Research Laboratory

Sven-Arne Reinemo
Simula Research Laboratory

Jose Flich
Polytechnic University of Valencia

José Duato
Polytechnic University of Valencia