J. Duato

Polytechnic University of Valencia

Publication

Featured research published by J. Duato.

international conference on parallel processing | 2001

Deadlock-free routing in InfiniBand™ through destination renaming

Pedro López; Jose Flich; J. Duato

The InfiniBand Architecture (IBA) defines a switch-based network with point-to-point links that supports any topology defined by the user, including irregular ones, in order to provide flexibility and incremental expansion capability. Routing in IBA is distributed, based on forwarding tables, and only considers the packet destination ID for routing within subnets in order to drastically reduce forwarding table size. Unfortunately, the forwarding tables for most of the previously proposed routing algorithms for irregular topologies consider both the destination ID and the input channel. Therefore, these popular routing algorithms for irregular topologies may not be usable in InfiniBand networks because they do not conform to the IBA specifications. In this paper we propose an easy-to-implement strategy to adapt the forwarding tables already computed following any routing algorithm that considers the destination ID and the input channel into the required IBA forwarding table format. The resulting routing algorithm is deadlock-free on IBA. Indeed, the originally computed paths are not modified at all. Hence, the proposed strategy does not degrade performance with respect to the original routing scheme.

international conference on parallel processing | 2006

On the influence of the selection function on the performance of fat-trees

F. Gilabert; María Engracia Gómez; Pedro López; J. Duato

The fat-tree topology has become very popular among switch manufacturers. Routing in fat-trees is composed of two phases: an adaptive upwards phase and a deterministic downwards phase. The unique downwards path to the destination depends on the switch that has been reached in the upwards phase. As adaptive routing is used in the ascending phase, several output ports are possible at each switch and the final choice depends on the selection function. The impact of the selection function on performance has been previously studied for direct networks and has not proved to be very important. In fat-trees, the decisions made in the upwards phase by the selection function can be critical, since they determine the switch reached in the upwards phase, and therefore the unique downwards path to the destination. In this paper, we analyze the effect of the selection function on fat-trees. Several selection functions are defined, compared and evaluated. The evaluation shows that the selection function has a great impact on performance in fat-trees.

international conference on parallel processing | 2006

A scalable synchronization technique for distributed virtual environments based on networked-server architectures

P. Morillo; J.M. Orduña; J. Duato

Large-scale distributed virtual environments have become a major trend in distributed applications, mainly due to the enormous popularity of multi-player online games in the entertainment industry. Thus, scalability has become an essential issue for these highly interactive systems. In this paper, we propose a new synchronization technique for those distributed virtual environments that are based on networked-server architectures. Unlike other methods described in the literature, the proposed technique takes into account the updating messages exchanged by avatars, thus releasing the servers from updating the location of such avatars when synchronizing the state of the system. As a result, the communications required for synchronization are greatly reduced, and the method is more scalable. Also, these communications are distributed across the whole synchronization period, in order to reduce workload peaks. Performance evaluation results show that the proposed approach significantly reduces the percentage of CPU utilization in the servers when compared with other existing methods, therefore supporting a higher number of avatars. Additionally, the system response time is reduced accordingly.

international conference on parallel processing | 2006

RECN-DD: A Memory-Efficient Congestion Management Technique for Advanced Switching

Pedro Javier García; Francisco J. Quiles; Jose Flich; J. Duato; I. Johnson

As VLSI technology advances, the interconnection network represents a larger percentage of the total system cost and power consumption. In fact, a current trend in network design is to reduce the number of components. However, this leads to systems working closer to the saturation point, and therefore an efficient congestion management technique is required. In that sense, RECN has been recently proposed for advanced switching (AS). RECN detects the formation of congestion trees and dynamically allocates queues for storing congested packets, thus eliminating the HOL blocking introduced by congestion trees. These queues are deallocated when congestion vanishes. We have identified two shortcomings that may affect RECN scalability and implementation. Firstly, although RECN allocates queues in an efficient way, resource deallocation is performed in-order, thus losing efficiency and wasting resources. This leads to an excessive requirement of memory at switch ports. Secondly, both allocation and deallocation mechanisms involve the use of specific control packets not supported by the AS standard, thus preventing RECN implementation. In this sense we provide a detailed description of the current RECN deallocation mechanism. In this paper we present an enhanced RECN version (RECN-DD) where these problems have been eliminated. Specifically, we propose a new distributed queue deallocation mechanism that reduces the number of required resources and does not require the use of control packets. Moreover, we propose a new congestion notification mechanism that does not require non-standard AS packets. Instead, flow control packets are used to notify congestion, thus simplifying the implementation of RECN-DD in AS.

international conference on parallel processing | 2012

Towards an efficient fat-tree like topology

D. Bermúdez Garzón; C. Gómez; María Engracia Gómez; Pedro López; J. Duato

Topology and routing algorithm are two key design parameters for interconnection networks. They largely determine the performance achieved by the network, but also its complexity and cost. Many of the commodity interconnects for clusters are based on the fat-tree topology, which allows both a rich interconnection among the nodes of the network and the use of adaptive routing. In this paper, we analyze how the routing algorithm affects the complexity of the switch, and considering this, we also propose and analyze some extensions of the fat-tree topology to take advantage of the available hardware resources. We analyze not only the impact of these extensions on performance but also their influence on switch complexity and cost.

international conference on parallel processing | 2003

Routing in InfiniBand™ torus network topologies

José Carlos Sancho; Antonio Robles; Pedro López; Jose Flich; J. Duato

InfiniBand is an interconnect standard for communication between processing nodes and I/O devices as well as for interprocessor communication (NOWs). The InfiniBand architecture (IBA) defines a switch-based network with point-to-point links whose topology can be established by the customer. When performance is the primary concern, regular topologies are preferred. Low-dimensional tori (2D and 3D) are some of the regular topologies most widely used in commercial parallel computers. Routing in tori requires the use of virtual channels. Although InfiniBand provides support for deterministic routing and virtual channels, the virtual channels are selected at each switch by service level (SL) identifiers associated with packets and do not depend on packet destination. This makes routing algorithm implementation more complex. In particular, a large number of SLs, which are a scarce resource, may be required. We analyze the way several routing strategies can be applied in InfiniBand torus networks, also evaluating their resource requirements. In particular, we analyze and compare the well-known e-cube and up*/down* routing algorithms and the recently proposed flexible routing algorithm.
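
To make the e-cube algorithm named in this abstract concrete, here is a minimal Python sketch of dimension-order routing on a torus. It is an illustration only, not code from the paper: the function names, the coordinate representation and the shortest-direction tie-break are assumptions, and it deliberately leaves out the virtual channels and SL assignment that the abstract identifies as the hard part in InfiniBand.

```python
def ecube_next_hop(current, dest, dims):
    """Return (dimension, direction) for the next hop, or None on arrival.

    Dimensions are corrected strictly in order (lowest first), which is
    what makes the routing deterministic. `dims` holds the ring size of
    each dimension; all names here are illustrative assumptions.
    """
    for d, (c, t, k) in enumerate(zip(current, dest, dims)):
        if c != t:
            forward = (t - c) % k  # hops needed when moving in the +1 direction
            # Take the shorter way around the ring; ties go to +1.
            direction = 1 if forward <= k - forward else -1
            return d, direction
    return None


def ecube_path(src, dest, dims):
    """List every node visited from src to dest, inclusive."""
    node = list(src)
    path = [tuple(node)]
    while (hop := ecube_next_hop(node, dest, dims)) is not None:
        d, direction = hop
        node[d] = (node[d] + direction) % dims[d]  # wrap around the ring
        path.append(tuple(node))
    return path


# On a 4x4 torus, going from (0, 0) to (3, 1) uses the wraparound link
# in dimension 0 first, then moves one step in dimension 1.
print(ecube_path((0, 0), (3, 1), (4, 4)))
```

Because the wraparound links close a cycle inside each dimension, a real implementation also has to split traffic over virtual channels to stay deadlock-free; that step is omitted here.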

parallel, distributed and network-based processing | 2014

FT-RUFT: A Performance and Fault-Tolerant Efficient Indirect Topology

D. Bermudez Garzon; María Engracia Gómez; Pedro López; J. Duato; C. Gómez

Although performance is a key design issue of interconnection networks, fault-tolerance is becoming more important due to the large number of components of large machines. In this paper, we focus on designing a simple indirect topology with both good performance and fault-tolerance properties. The idea is to take full advantage of the network resources consumed by the topology. To do that, starting from the RUFT topology, which is a simple UMIN topology that does not tolerate any link fault, we first duplicate injection and ejection links, connecting these extra links in a particular way. The resulting topology tolerates 3 network link faults and also slightly increases performance with a marginal increase in the network hardware cost. Most importantly, contrary to most of the available topologies, the topology can also tolerate faults in the links that connect to end-nodes. We also propose another topology that also duplicates network links, achieving 2x performance improvements and tolerating up to 7 network link faults. These results are better than the ones obtained by a BMIN with a similar amount of resources.

european conference on parallel processing | 2002

Evaluation of Routing Algorithms for InfiniBand Networks

María Engracia Gómez; Jose Flich; Antonio Robles; Pedro López; J. Duato

Storage Area Networks (SANs) provide the scalability required by IT servers. The InfiniBand (IBA) interconnect is very likely to become the de facto standard for SANs as well as for NOWs. The routing algorithm is a key design issue in irregular networks. Moreover, as several virtual lanes can be used and different network issues can be considered, the performance of the routing algorithms may be affected. In this paper we evaluate three existing routing algorithms (up*/down*, DFS, and smart-routing) suitable for being applied to IBA. Evaluation has been performed by simulation under different synthetic traffic patterns and I/O traces. Simulation results show that the smart-routing algorithm achieves the highest performance.

The Journal of Supercomputing | 2015

A HoL-blocking aware mechanism for selecting the upward path in fat-tree topologies

Crispín Gómez; F. Gilabert; María Engracia Gómez; Pedro López; J. Duato

Large cluster-based machines require efficient high-performance interconnection networks. Routing is a key design issue of interconnection networks. Adaptive routing usually outperforms deterministic routing at the expense of introducing out-of-order packet delivery. Many of the commodity interconnects for clusters are based on fat-trees. The adaptive routing algorithm commonly used in fat-trees is composed of a fully adaptive upward subpath, followed by a deterministic downward subpath. As the latter is determined by the former, choosing the most adequate upward path for each packet is critical in fat-trees to achieve good performance. In this paper, we present a mechanism for selecting the upward path in fat-trees, which enables optimum use of the available network resources to achieve a high network throughput. The proposed path selection is destination based, which reduces the head-of-line blocking effect. Indeed, the proposed mechanism can be used either as a selection function (the provided path is used as the preferred one), or as a deterministic routing algorithm (the path is the only possible one). The results show that the resulting selection function outperforms any other known one. Moreover, the proposed deterministic routing algorithm can achieve a similar, or even higher, level of performance than adaptive routing, while providing in-order packet delivery and a simpler switch implementation.
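
The abstract does not give the exact selection rule, so the following Python sketch only illustrates what a destination-based upward path can look like. It assumes a k-ary fat-tree in which each switch has k upward ports, and it uses one simple digit-of-the-destination rule; the stage numbering (0 at the leaf switches), the function names and the formula itself are assumptions and may differ from the mechanism proposed in the paper.

```python
def up_port(dest, stage, k):
    """Pick the upward port at a switch of the given stage.

    Assumed rule: use the stage-th base-k digit of the destination ID,
    so the choice depends only on the destination, never on the source.
    """
    return (dest // k**stage) % k


def upward_ports(dest, k, levels_up):
    """Upward ports taken at stages 0 .. levels_up-1 for one destination."""
    return [up_port(dest, stage, k) for stage in range(levels_up)]


# In a binary tree (k = 2), destination 6 (binary 110) selects
# ports 0, 1, 1 at stages 0, 1 and 2.
print(upward_ports(6, 2, 3))
```

Since the source plays no part in the choice, every packet bound for a given destination takes the same port at a given switch, and packets for different destinations are spread over the k ports by digit. That property is what lets such a rule serve either as a preferred-port hint for adaptive routing or as a fully deterministic route.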

international conference on parallel processing | 2003