Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Rui Machado is active.

Publications


Featured research published by Rui Machado.


Plant and Soil | 2003

Tomato root distribution, yield and fruit quality under subsurface drip irrigation

Rui Machado; Maria do Rosário; G. Oliveira; Carlos A. M. Portas

Tomato rooting patterns were evaluated in a 2-year field trial where surface drip irrigation (R0) was compared with subsurface drip irrigation at 20 cm (RI) and 40 cm (RII) depths. Pot-transplanted plants of two processing tomato cultivars, ‘Brigade’ (C1) and ‘H3044’ (C2), were used. The behaviour of the root system in response to different irrigation treatments was evaluated through minirhizotrons installed between two plants, in proximity to the plant row. Root length intensity (La), the length of root per unit of minirhizotron surface area (cm cm−2), was measured at the blooming stage and at harvest. For all sampling dates, the depth of the drip irrigation tube, the cultivar and the interaction between treatments did not significantly influence La. However, differences between irrigation treatments were observed in root distribution along the soil profile, and a large concentration of roots at the depth of the irrigation tubes was found. For both surface and subsurface drip irrigation and for both cultivars, most of the root system was concentrated in the top 40 cm of the soil profile, where root length density ranged between 0.5 and 1.5 cm cm−3. Commercial yields (t ha−1) were 87.6 and 114.2 (R0), 107.5 and 128.1 (RI), and 105.0 and 124.8 (RII) for 1997 and 1998, respectively. Differences between the 2 years may be attributed to different climatic conditions. In the second year, although no significant differences were found among treatments, slightly higher values were observed with irrigation tubes at 20 cm depth. Fruit quality was not significantly affected by treatments or by the interaction between irrigation tube depth and cultivar.


Frontiers in Plant Science | 2011

Comparative Effects of Nitrogen Fertigation and Granular Fertilizer Application on Growth and Availability of Soil Nitrogen during Establishment of Highbush Blueberry

David R. Bryla; Rui Machado

A 2-year study was done to compare the effects of nitrogen (N) fertigation and granular fertilizer application on growth and availability of soil N during establishment of highbush blueberry (Vaccinium corymbosum L. “Bluecrop”). Treatments included four methods of N application (weekly fertigation, split fertigation, and two non-fertigated controls) and four levels of N fertilizer (0, 50, 100, and 150 kg·ha−1 N). Fertigation treatments were irrigated by drip and injected with a liquid urea solution; weekly fertigation was applied once a week from leaf emergence to 60 d prior to the end of the season, while split fertigation was applied as a triple-split from April to June. Non-fertigated controls were fertilized with granular ammonium sulfate, also applied as a triple-split, and irrigated by drip or microsprinklers. Weekly fertigation produced the smallest plants among the four fertilizer application methods at 50 kg·ha−1 N during the first year after planting but the largest plants at 150 kg·ha−1 N in both the first and second year. The other application methods required less N to maximize growth but were less responsive than weekly fertigation to additional N fertilizer applications. In fact, 44–50% of the plants died when granular fertilizer was applied at 150 kg·ha−1 N. By comparison, none of the plants died with weekly fertigation. Plant death with granular fertilizer was associated with high ammonium ion concentrations (up to 650 mg·L−1) and electrical conductivity (>3 dS·m−1) in the soil solution. Early results indicate that fertigation may be less efficient (i.e., less plant growth per unit of N applied) at lower N rates than granular fertilizer application but is also safer (i.e., less plant death) and promotes more growth when high amounts of N fertilizer are applied.


Computer Science - Research and Development | 2011

Unbalanced tree search on a manycore system using the GPI programming model

Rui Machado; Carsten Lojewski; Salvador Abreu; Franz-Josef Pfreundt

Recent developments in computer architecture progress towards systems with a large core count (manycore), which expose more parallelism to applications. Some applications, termed irregular and unbalanced, demand a dynamic and asynchronous load-balancing implementation to utilize the full performance of a manycore system. For example, the recently established Graph500 benchmark aims at such applications. The UTS benchmark characterizes the performance of such irregular and unbalanced computations with a tree-structured search space that requires continuous dynamic load balancing. GPI is a PGAS API that delivers the full performance of RDMA-enabled networks directly to the application. Its programming model focuses on the use of one-sided asynchronous communication, overlapping computation and communication. In this paper we address the dynamic load-balancing requirements of unbalanced applications using the GPI programming model. Using the UTS benchmark, we detail the implementation of a work-stealing algorithm using GPI and present the performance results. Our performance evaluation shows significant improvements when compared with the optimized MPI version, with a maximum performance of 9.5 billion nodes per second on 3,072 cores.
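The core idea of work stealing, as used in the implementation above, can be sketched in plain Python threads. This is an illustrative toy (shared-memory deques, not the paper's one-sided GPI/RDMA scheme): each worker pops work from its own deque LIFO-style and, when idle, steals from the opposite end of a random victim's deque. The tree is encoded simply by its remaining depth.

```python
import collections
import random
import threading

def run_tree_search(num_workers=4, branching=3, depth=5):
    """Count the nodes of a uniform tree via work-stealing workers."""
    deques = [collections.deque() for _ in range(num_workers)]
    locks = [threading.Lock() for _ in range(num_workers)]
    counts = [0] * num_workers          # nodes visited per worker
    done = threading.Event()

    deques[0].append(depth)             # root node, encoded by remaining depth

    def worker(wid):
        rng = random.Random(wid)
        while not done.is_set():
            node = None
            with locks[wid]:
                if deques[wid]:
                    node = deques[wid].pop()       # own work, LIFO end
            if node is None:                        # run dry: try to steal
                victim = rng.randrange(num_workers)
                with locks[victim]:
                    if deques[victim]:
                        node = deques[victim].popleft()  # steal FIFO end
            if node is None:
                continue
            counts[wid] += 1
            if node > 0:                            # expand children
                with locks[wid]:
                    deques[wid].extend([node - 1] * branching)

    threads = [threading.Thread(target=worker, args=(w,), daemon=True)
               for w in range(num_workers)]
    for t in threads:
        t.start()
    # crude termination: total node count of a uniform tree is known up front
    expected = sum(branching ** d for d in range(depth + 1))
    while sum(counts) < expected:
        pass
    done.set()
    for t in threads:
        t.join()
    return sum(counts)
```

The termination check exploits that the uniform tree's size is known in advance; a real UTS implementation needs a distributed termination-detection protocol instead.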


International Conference on Cluster Computing | 2015

Evaluation of Parallel Communication Models in Nekbone, a Nek5000 Mini-Application

I. B. Ivanov; Jing Gong; Dana Akhmetova; Ivy Bo Peng; Stefano Markidis; Erwin Laure; Rui Machado; Mirko Rahn; Valeria Bartsch; Alistair Hart; Paul Fischer

Nekbone is a proxy application of Nek5000, a scalable Computational Fluid Dynamics (CFD) code used for modelling incompressible flows. The Nekbone mini-application is used by several international co-design centers to explore new concepts in computer science and to evaluate their performance. We present the design and implementation of a new communication kernel in the Nekbone mini-application with the goal of studying the performance of different parallel communication models. First, a new MPI blocking communication kernel has been developed to solve Nekbone problems in a three-dimensional Cartesian mesh and process topology. The new MPI implementation delivers a 13% performance improvement compared to the original implementation. The new MPI communication kernel consists of approximately 500 lines of code against the original 7,000 lines of code, allowing experimentation with new approaches in Nekbone parallel communication. Second, the MPI blocking communication in the new kernel was changed to MPI non-blocking communication. Third, we developed a new Partitioned Global Address Space (PGAS) communication kernel, based on the GPI-2 library. This approach reduces the synchronization among neighbor processes; in our tests on 8,192 processes, the GPI-2 communication kernel is on average 3% faster than the new MPI non-blocking communication kernel. In addition, we have used OpenMP in all versions of the new communication kernel. Finally, we highlight the future steps for using the new communication kernel in the parent application Nek5000.
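A communication kernel on a three-dimensional Cartesian process topology, as described above, needs a mapping between linear ranks and grid coordinates to find each process's face neighbors. The helpers below sketch that mapping; the function names and the choice of periodic (wrap-around) boundaries are illustrative assumptions, not Nekbone code.

```python
def rank_to_coords(rank, dims):
    """Map a linear rank to (x, y, z) coordinates in a dims = (nx, ny, nz) grid."""
    nx, ny, nz = dims
    return (rank % nx, (rank // nx) % ny, rank // (nx * ny))

def coords_to_rank(coords, dims):
    """Inverse mapping: (x, y, z) coordinates back to a linear rank."""
    nx, ny, nz = dims
    x, y, z = coords
    return x + y * nx + z * nx * ny

def neighbors(rank, dims):
    """Ranks of the six face neighbors, with periodic boundaries."""
    nx, ny, nz = dims
    x, y, z = rank_to_coords(rank, dims)
    return {
        "-x": coords_to_rank(((x - 1) % nx, y, z), dims),
        "+x": coords_to_rank(((x + 1) % nx, y, z), dims),
        "-y": coords_to_rank((x, (y - 1) % ny, z), dims),
        "+y": coords_to_rank((x, (y + 1) % ny, z), dims),
        "-z": coords_to_rank((x, y, (z - 1) % nz), dims),
        "+z": coords_to_rank((x, y, (z + 1) % nz), dims),
    }
```

In an MPI code this bookkeeping is typically delegated to `MPI_Cart_create` and `MPI_Cart_shift`; spelling it out makes the exchange pattern explicit for a PGAS port.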


International Conference on Cluster Computing | 2015

Building a Fault Tolerant Application Using the GASPI Communication Layer

Faisal Shahzad; Moritz Kreutzer; Thomas Zeiser; Rui Machado; Andreas Pieper; Georg Hager; Gerhard Wellein

It is commonly agreed that highly parallel software on exascale computers will suffer from many more runtime failures due to the decreasing trend in the mean time to failure (MTTF). Therefore, it is not surprising that a lot of research is going on in the area of fault tolerance and fault mitigation. Applications should survive a failure and/or be able to recover with minimal cost. MPI is not yet very mature in handling failures; the User-Level Failure Mitigation (ULFM) proposal, currently the most promising approach, is still in its prototype phase. In our work we use GASPI, which is a relatively new communication library based on the PGAS model. It provides the missing features to allow the design of fault-tolerant applications. Instead of introducing algorithm-based fault tolerance in its true sense, we demonstrate how we can build on (existing) clever checkpointing and extend applications to integrate a low-cost fault detection mechanism and, if necessary, recover the application on the fly. The aspects of process management, the restoration of groups and the recovery mechanism are presented in detail. We use a sparse matrix vector multiplication based application to perform the analysis of the overhead introduced by such modifications. Our fault detection mechanism causes no overhead in failure-free cases, whereas in case of failure(s), the failure detection and recovery cost is of reasonably acceptable order and shows good scalability.
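The checkpoint-and-roll-back pattern described above can be sketched in a few lines. This is a minimal in-memory toy (no GASPI process management or group restoration): a solver checkpoints its state every k iterations; when a failure is detected, it restores the last checkpoint and resumes, redoing only the iterations since that checkpoint. The failure-injection parameter is purely for illustration.

```python
import copy

def run_with_recovery(n_iters, checkpoint_every, fail_at=None):
    """Accumulate sum(range(n_iters)), surviving one injected failure.

    Returns (final value, number of recoveries performed).
    """
    state = {"iteration": 0, "value": 0}
    checkpoint = copy.deepcopy(state)   # initial checkpoint
    failed_once = False
    recoveries = 0
    while state["iteration"] < n_iters:
        # injected failure: detected once, then roll back to last checkpoint
        if fail_at is not None and state["iteration"] == fail_at and not failed_once:
            failed_once = True
            recoveries += 1
            state = copy.deepcopy(checkpoint)
            continue
        state["value"] += state["iteration"]    # the "work" of one iteration
        state["iteration"] += 1
        if state["iteration"] % checkpoint_every == 0:
            checkpoint = copy.deepcopy(state)   # periodic low-cost checkpoint
    return state["value"], recoveries
```

The key property, mirrored in the test below, is that the recovered run produces the same result as the failure-free run, at the cost of redoing at most `checkpoint_every` iterations.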


International Conference on Parallel Processing | 2013

On the Scalability of Constraint Programming on Hierarchical Multiprocessor Systems

Rui Machado; Vasco Pedro; Salvador Abreu

Recent developments in computer architecture progress towards connected systems with a large core count, which expose more parallelism to applications, creating a hierarchical setup at the node and cluster levels. Declarative approaches such as those based on constraints are attractive for parallel programming because they concentrate on the logic of the problem. They have been successfully applied to hard problems, which usually involve searching through large problem spaces. Search lends itself naturally to parallelisation by exploiting different branches of the search tree, but scalability may be hard to achieve due to the highly dynamic load-balancing requirements. In this paper we present a high-level declarative approach based on constraints and show how it benefits from efficient work-stealing-based dynamic load balancing, targeted at large scale. Our study leverages the implementation of a hierarchical work-stealing scheme using a different programming model, GPI. Experimentation brought encouraging results on up to 512 cores on large instances of satisfaction and optimisation problems.


Practical Aspects of Declarative Languages | 2013

Parallel Performance of Declarative Programming Using a PGAS Model

Rui Machado; Salvador Abreu; Daniel Diaz

Constraint Programming is one approach to declarative programming where a problem is modeled as a set of variables with a domain and a set of relations (constraints) between them. Constraint-Based Local Search builds on the idea of using constraints to describe and control local search. Problems are modeled using constraints and heuristics, for which solutions are searched using local search. With the progressive move toward multi- and many-core systems, parallelism has become mainstream as the number of cores continues to increase. Declarative programming approaches such as those based on constraints need to be better understood and experimented with in order to understand their parallel behaviour. In this paper, we discuss experiments we have been carrying out with Adaptive Search and present a new parallel version of it based on GPI, a recent API and programming model for the development of scalable parallel applications. Our experiments on different problems show interesting speed-ups and, more importantly, a better understanding of how these gains are obtained, in the context of declarative programming.
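Constraint-based local search, as used above, can be illustrated with a sequential min-conflicts toy on N-queens (this is a generic sketch of the search style, not the Adaptive Search or GPI implementation): variables are queen rows, constraints forbid shared rows and diagonals, and the search repeatedly picks a conflicted variable and moves it to the value that minimizes its conflicts.

```python
import random

def conflicts(rows, col, row):
    """Number of queens attacking a queen placed at (row, col)."""
    return sum(
        1
        for c, r in enumerate(rows)
        if c != col and (r == row or abs(r - row) == abs(c - col))
    )

def min_conflicts_queens(n, max_steps=100_000, seed=0):
    """Solve N-queens by min-conflicts repair; returns row per column, or None."""
    rng = random.Random(seed)
    rows = [rng.randrange(n) for _ in range(n)]     # random initial assignment
    for _ in range(max_steps):
        conflicted = [c for c in range(n) if conflicts(rows, c, rows[c]) > 0]
        if not conflicted:
            return rows                              # all constraints satisfied
        col = rng.choice(conflicted)                 # pick a conflicted variable
        # move it to the row minimizing its conflicts (random tie-breaking)
        rows[col] = min(range(n),
                        key=lambda r: (conflicts(rows, col, r), rng.random()))
    return None                                      # give up
```

Parallel versions of this scheme typically run many such searches with different random seeds, or exchange partial configurations between workers; the speed-ups reported in the paper come from that level of the design, not the inner repair loop.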


Parallel Computing | 2016

Extreme Scale-out SuperMUC Phase 2 - Lessons Learned

Nicolay Hammer; Ferdinand Jamitzky; Helmut Satzger; Momme Allalen; Alexander Block; Anupam Karmakar; Matthias Brehm; Reinhold Bader; Luigi Iapichino; Antonio Ragagnin; Vasilios Karakasis; Dieter Kranzlmüller; Arndt Bode; Herbert Huber; Martin Kühn; Rui Machado; Daniel Grünewald; P. V. F. Edelmann; F. K. Röpke; Markus Wittmann; Thomas Zeiser; Gerhard Wellein; Gerald Mathias; Magnus Schwörer; Konstantin Lorenzen; Christoph Federrath; Ralf S. Klessen; Karl-Ulrich Bamberg; H. Ruhl; Florian Schornbaum

In spring 2015, the Leibniz Supercomputing Centre (Leibniz-Rechenzentrum, LRZ) installed its new petascale system, SuperMUC Phase 2. Selected users were invited for a 28-day extreme scale-out block operation during which they were allowed to use the full system for their applications. The following projects participated in the extreme scale-out workshop: BQCD (Quantum Physics), SeisSol (Geophysics, Seismics), GPI-2/GASPI (Toolkit for HPC), Seven-League Hydro (Astrophysics), ILBDC (Lattice Boltzmann CFD), Iphigenie (Molecular Dynamics), FLASH (Astrophysics), GADGET (Cosmological Dynamics), PSC (Plasma Physics), waLBerla (Lattice Boltzmann CFD), Musubi (Lattice Boltzmann CFD), Vertex3D (Stellar Astrophysics), CIAO (Combustion CFD), and LS1-Mardyn (Material Science). The projects were allowed to use the machine exclusively during the 28-day period, which corresponds to a total of 63.4 million core-hours, of which 43.8 million core-hours were used by the applications, resulting in a utilization of 69%. The top 3 users were using 15.2, 6.4, and 4.7 million core-hours, respectively.


IEEE International Conference on High Performance Computing, Data, and Analytics | 2016

The EPiGRAM Project: Preparing Parallel Programming Models for Exascale

Stefano Markidis; Ivy Bo Peng; Jesper Larsson Träff; Antoine Rougier; Valeria Bartsch; Rui Machado; Mirko Rahn; Alistair Hart; Daniel J. Holmes; Mark Bull; Erwin Laure

EPiGRAM is a European Commission funded project to improve existing parallel programming models so that large-scale applications run efficiently on exascale supercomputers. The EPiGRAM project focuses on the two currently dominant petascale programming models, message-passing and PGAS, and on the improvement of two of their associated programming systems, MPI and GASPI. In EPiGRAM, we work on two major aspects of programming systems. First, we improve the performance of communication operations by decreasing the memory consumption, improving collective operations and introducing emerging computing models. Second, we enhance the interoperability of message-passing and PGAS by integrating them in one PGAS-based MPI implementation, called EMPI4Re, implementing MPI endpoints and improving GASPI interoperability with MPI. The new EPiGRAM concepts are tested in two large-scale applications: iPIC3D, a Particle-in-Cell code for space physics simulations, and Nek5000, a Computational Fluid Dynamics code.


International Conference on Large-Scale Scientific Computing | 2015

Task-Based Parallel Sparse Matrix-Vector Multiplication (SpMVM) with GPI-2

Dimitar Stoyanov; Rui Machado; Franz-Josef Pfreundt

We present a task-based implementation of SpMVM with the PGAS communication library GPI-2. This computational kernel is essential for the overall performance of Krylov subspace solvers, but its proper hybrid parallel design is still a challenge on hierarchical architectures consisting of multi- and many-core sockets and nodes. The GPI-2 library allows, by default and in a natural way, a task-based parallelization. Thus, our implementation is fully asynchronous and considerably differs from the standard hybrid approaches combining MPI and threads/OpenMP. Here we briefly describe the GPI-2 library and our implementation of the SpMVM routine, and then we compare the performance of our Jacobi-preconditioned Richardson solver against PETSc-Richardson, using a Poisson BVP in a unit cube as a benchmark test. The comparison employs two types of domain decomposition and demonstrates the superior performance and better scalability of our task-based implementation.
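The SpMVM kernel itself, and the task-based decomposition of it, can be sketched compactly. The code below is a plain-Python illustration (threads as a stand-in for GPI-2 tasks; the row-block partitioning is an assumed, simple strategy): the matrix is stored in CSR format and the product is split into independent row-block tasks.

```python
from concurrent.futures import ThreadPoolExecutor

def spmv_csr(values, col_idx, row_ptr, x):
    """y = A @ x for a matrix A in CSR format (values, col_idx, row_ptr)."""
    n = len(row_ptr) - 1
    return [
        sum(values[k] * x[col_idx[k]] for k in range(row_ptr[i], row_ptr[i + 1]))
        for i in range(n)
    ]

def spmv_csr_tasks(values, col_idx, row_ptr, x, num_tasks=4):
    """Same product, computed as independent row-block tasks."""
    n = len(row_ptr) - 1
    y = [0.0] * n
    block = (n + num_tasks - 1) // num_tasks    # rows per task, rounded up

    def task(lo, hi):
        # each task owns a disjoint row range, so no synchronization is needed
        for i in range(lo, hi):
            y[i] = sum(values[k] * x[col_idx[k]]
                       for k in range(row_ptr[i], row_ptr[i + 1]))

    with ThreadPoolExecutor(max_workers=num_tasks) as pool:
        for lo in range(0, n, block):
            pool.submit(task, lo, min(lo + block, n))
    return y
```

Because row blocks write disjoint slices of `y`, the tasks are embarrassingly parallel; in the distributed setting the interesting part, which this sketch omits, is overlapping the gather of remote entries of `x` with the computation on locally available rows.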

Collaboration


Dive into Rui Machado's collaborations.

Top Co-Authors

David R. Bryla
Agricultural Research Service

Carina Rejane Pivetta
Universidade Federal de Santa Maria

Gerhard Wellein
University of Erlangen-Nuremberg

Thomas Zeiser
University of Erlangen-Nuremberg

Erwin Laure
Royal Institute of Technology

Stefano Markidis
Royal Institute of Technology