Manuel P. Malumbres
Polytechnic University of Valencia
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Manuel P. Malumbres.
international parallel and distributed processing symposium | 1996
Manuel P. Malumbres; José Duato; Josep Torrellas
This paper presents an efficient routing and flow control mechanism to implement multidestination message passing in wormhole networks. It is targeted to situations where the size of message data is very small, like in invalidation and update messages in distributed shared-memory multiprocessors (DSMs) with hardware cache coherence. The mechanism is a variation of tree-based multicast with pruning to avoid deadlocks. The new scheme does not require that the destination addresses in a given multicast message be ordered, thereby avoiding any ordering overhead. It allows messages to use any deadlock-free routing function and only requires one startup for each multicast message. The new scheme has been evaluated on several k-ary n-cube networks under synthetic loads. The results show that the proposed scheme is faster than other multicast mechanisms when the multicast traffic is composed of short messages.
parallel computing in electrical engineering | 2006
A. Rodríguez; A. González; Manuel P. Malumbres
Last generation video encoding standards increase computing demands in order to reach the limits on compression efficiency. This is particularly the case of H.264/AVC specification that is gaining interest in industry. We are interested in applying parallel processing to H.264 encoders in order to fulfill the computation requirements imposed by stressing applications like video on demand, videoconference, live broadcast, etc. Given a delivered video quality and bit rate, the main complexity parameters are image resolution, frame rate and latency. These parameters can still be pushed forward in such a way that special purpose hardware solutions are not available. Parallel processing based on off-the-shelf components is a more flexible general purpose alternative. In this work we propose a hierarchical parallelization of H.264 encoders very well suited to low cost clusters. Our proposal uses MPI message passing parallelization at two levels: GOP and frame. The GOP level encodes simultaneously several groups of consecutive frames and the frame level encodes in parallel several slices of one frame. In previous work we found that GOP parallelism alone gives good speed-up but imposes very high latency, on the other side frame parallelism gets less efficiency but low latency. Combining both approaches we obtain a compromise between speed-up and latency and then a broader spectrum of applications can be covered
parallel computing | 1997
Federico Silla; Manuel P. Malumbres; Antonio Robles; Pedro López; José Duato
Networks of workstations are rapidly emerging as a cost-effective alternative to parallel computers. Switch-based interconnects with irregular topologies allow the wiring flexibility, scalability and incremental expansion capability required in this environment. The irregularity also makes routing and deadlock avoidance on such systems quite complicated. Current proposals avoid deadlock by removing cyclic dependencies between channels. As a consequence, many messages are routed following non-minimal paths, increasing latency and wasting resources. In this paper, we propose a general methodology for the design of adaptive routing algorithms for networks with irregular topology. These routing algorithms allow messages to follow minimal paths in most cases, reducing message latency and increasing network throughput. The methodology is based on the application of the theory of deadlock avoidance proposed in [14], which increases routing flexibility by allowing cyclic dependencies between channels. As an example of application, we propose an adaptive routing algorithm for Autonet. It can be implemented either by duplicating physical channels or by splitting each physical channel into two virtual channels. In the former case, the implementation does not require a new switch design. It only requires changing the routing tables and adding links in parallel with existing ones, taking advantage of spare switch ports. In the latter case, a new switch design is required but the network topology is not changed. Preliminary evaluation results show that the new routing algorithm is able to increase throughput for random traffic by a factor of up to 2.8 with respect to the original algorithm, also reducing latency.
international parallel and distributed processing symposium | 2000
Jose Flich; Manuel P. Malumbres; Pedro López; José Duato
Networks of workstations (NOWs) are becoming increasingly popular as a cost-effective alternative to parallel computers. Typically, these networks connect processors using irregular topologies, providing the wiring flexibility, scalability, and incremental expansion capability required in this environment. In some of these networks, packets are delivered using source routing. Due to the irregular topology, the routing scheme is often non-minimal. In this paper we analyze the routing scheme used in Myrinet networks in order to improve its performance. We propose new routing algorithms that balance the utilization of the available routes and always use minimal paths. We show through simulation that the current routing schemes used in Myrinet networks can be improved by modifying only the routing software without increasing the software overhead significantly. The overall throughput can be doubled without modifying the network hardware.
Journal of Systems Architecture | 2000
Manuel P. Malumbres; José Duato
This paper presents an efficient routing and flow control mechanism to implement multidestination message passing in wormhole networks. The mechanism is a variation of tree-based multicast with pruning to recover from deadlocks and it is well suited for distributed shared-memory multiprocessors (DSMs) with hardware cache coherence. It does not require any preprocessing of multicast messages reducing notably the software overhead required to send a multicast message. Also, it allows messages to use any deadlock-free routing function. The new scheme has been evaluated by simulation using synthetic loads. It achieves multicast latency reductions of 30% on average. Also it was compared with other multicast mechanisms proving its benefits. Finally, it can be easily implemented in hardware with minimal changes to existing unicast wormhole routers.
Sensors | 2012
Otoniel López; Miguel Onofre Martínez Rach; Héctor Migallón; Manuel P. Malumbres; Alberto Bonastre; Juan J. Serrano
Monitoring pest insect populations is currently a key issue in agriculture and forestry protection. At the farm level, human operators typically must perform periodical surveys of the traps disseminated through the field. This is a labor-, time- and cost-consuming activity, in particular for large plantations or large forestry areas, so it would be of great advantage to have an affordable system capable of doing this task automatically in an accurate and a more efficient way. This paper proposes an autonomous monitoring system based on a low-cost image sensor that it is able to capture and send images of the trap contents to a remote control station with the periodicity demanded by the trapping application. Our autonomous monitoring system will be able to cover large areas with very low energy consumption. This issue would be the main key point in our study; since the operational live of the overall monitoring system should be extended to months of continuous operation without any kind of maintenance (i.e., battery replacement). The images delivered by image sensors would be time-stamped and processed in the control station to get the number of individuals found at each trap. All the information would be conveniently stored at the control station, and accessible via Internet by means of available network services at control station (WiFi, WiMax, 3G/4G, etc.).
international conference on parallel processing | 1998
Federico Silla; Manuel P. Malumbres; José Duato; Donglai Dai; Dhabaleswar K. Panda
Networks of workstations (NOWs) are becoming increasingly popular as an alternative to parallel computers. Typically, these networks present irregular topologies, providing the wiring flexibility, scalability, and incremental expansion capability required in this environment. Similar to the evolution of parallel computers, NOWs are also evolving from distributed memory to shared memory. However distances between processors are longer in NOWs, leading to higher message latency and lower network bandwidth. Therefore, one can expect the network to be a bottleneck when executing some parallel applications on a NOW supporting a shared-memory programming paradigm. The authors analyze whether the interconnection network in a NOW is able to efficiently handle the traffic generated in a DSM with the same number of processors. They evaluate the behavior of a NOW using application traces captured during the execution of several SPLASH2 applications on a DSM simulator. They show through simulation that the adaptive routing algorithm previously proposed by them almost eliminates network saturation due to its ability to support a higher sustained throughput. Therefore, adaptive routing becomes a key design issue to achieve similar performance in NOWs and tightly-coupled DSMs.
Proceedings of the international workshop on Workshop on mobile video | 2007
Miguel Martínez-Rach; Otoniel López; Pablo Piñol; Manuel P. Malumbres; Carlos Miguel Tavares Calafate
It is well known that PSNR does not always rank quality of an image or video sequence in the same way that a human being. There are many other factors considered by the human visual system and the brain. So, a lot of efforts were required to find an objective video quality metric that is able to measure the quality distortion similarly to the one perceived by the destination user. We analyze the behaviour of some of the most relevant objective quality metrics when they are applied to video compressed by a H264/AVC codec at different bit-rates and with error resilience options enabled. Video data is transmitted in a wireless MANET environment and packet losses are modelled for different scenarios including variable congestion and mobility states. We take as reference the PSNR metric and try to find out if there is a more accurate metric in terms of human quality perception that could substitute PSNR in the performance evaluation of different coding proposals under packet loss scenarios.
IEEE Transactions on Parallel and Distributed Systems | 2002
Jose Flich; Pedro López; Manuel P. Malumbres; José Duato
Networks of workstations (NOWs) are becoming increasingly popular as a cost-effective alternative to parallel computers. These networks allow the customer to connect processors using irregular topologies, providing the wiring flexibility, scalability and incremental expansion capability required in this environment. Some of these networks use source routing and wormhole switching. In particular, we are interested in Myrinet networks because they are a well-known commercial product and their behavior can be controlled by the software running on the network interfaces (the Myrinet Control Program, MCP). Usually, the Myrinet network uses up*/down* routing for computing the paths for every source-destination pair. In this paper, we propose an in-transit buffer (ITB) mechanism to improve the network performance. We apply the ITB mechanism to NOWs with up*/down* source routing, like the Myrinet, analyzing its behavior on networks with both regular and irregular topologies. The proposed scheme can be implemented on Myrinet networks by simply modifying the MCP, without changing the network hardware. We evaluate by simulation several networks with different traffic patterns using timing parameters taken from the Myrinet network. The results show that the current routing schemes used in Myrinet networks can be strongly improved by applying the ITB mechanism. In general, our proposed scheme is able to double the network throughput on medium and large NOWs. Finally, we present a first implementation of the ITB mechanism on a Myrinet network.
ieee international conference on high performance computing data and analytics | 2000
Jose Flich; Pedro López; Manuel P. Malumbres; José Duato; Tomas Rokicki
In previous papers we proposed the ITB mechanism to improve the performance of up*/down* routing in irregular networks with source routing. With this mechanism, both minimal routing and a better use of network links are guaranteed, resulting on an overall network performance improvement. In this paper, we show that the ITB mechanism can be used with any source routing sch eme in th eCOW environment. In particular, we apply ITBs to DFS and Smart routing algorithms, which provide better routes than up*/down* routing. Results show that ITB strongly improves DFS (by 63%, for 64-switchn etworks) and Smart throughput (23%, for 32-switch networks).