Mauricio Ayala-Rincón
University of Brasília
Publications
Featured research published by Mauricio Ayala-Rincón.
Workshop on Logic, Language, Information and Computation | 2007
André Luiz Galdino; César A. Muñoz; Mauricio Ayala-Rincón
Highly accurate positioning systems and new broadcasting technology have enabled air traffic management concepts where the responsibility for aircraft separation resides with pilots rather than with air traffic controllers. The Formal Methods Group at the National Institute of Aerospace and NASA Langley Research Center has proposed and formally verified an algorithm, called KB3D, for distributed three-dimensional conflict resolution. KB3D computes resolution maneuvers where only one component of the velocity vector, i.e., ground speed, vertical speed, or heading, is modified. Although these maneuvers are simple for a pilot to implement, they are not necessarily optimal from a geometrical point of view. In general, optimal resolutions require the combination of all the components of the velocity vector. In this paper, we propose a two-dimensional version of KB3D, which we call KB2D, that computes resolution maneuvers that are optimal with respect to ground speed and heading changes. The algorithm has been mechanically verified in the Prototype Verification System (PVS). The verification relies on algebraic proof techniques for the manipulation of the geometrical concepts relevant to the algorithm as well as standard deductive techniques available in PVS.
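Algorithms of this family are built on top of a straight-line conflict-detection predicate: given relative position and velocity, decide whether the horizontal separation will drop below a protected-zone radius within a lookahead time. The following Python sketch shows that standard closest-point-of-approach test only; it is an illustration of the underlying geometry, not the verified KB3D/KB2D code, and the function name and parameters are invented for this example.

```python
import math

def horizontal_conflict(s, v, D, T):
    """Predict a horizontal loss of separation within lookahead time T.

    s: relative position (ownship minus intruder), v: relative velocity,
    D: minimum horizontal separation. Straight-line trajectories assumed.
    """
    sx, sy = s
    vx, vy = v
    a = vx * vx + vy * vy
    if a == 0.0:
        # No relative motion: conflict iff the aircraft are already too close.
        return math.hypot(sx, sy) < D
    # Time of closest approach of the relative trajectory s + t*v.
    t_cpa = -(sx * vx + sy * vy) / a
    # Clamp into the lookahead window [0, T] before evaluating the distance.
    t = min(max(t_cpa, 0.0), T)
    dx, dy = sx + t * vx, sy + t * vy
    return math.hypot(dx, dy) < D
```

A resolution algorithm such as KB2D then searches for the ground-speed or heading change that makes this predicate false while minimizing the deviation from the original trajectory.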
VI Southern Programmable Logic Conference (SPL) | 2010
Daniel M. Muñoz; Diego F. Sánchez; Carlos H. Llanos; Mauricio Ayala-Rincón
Computation of floating-point transcendental functions is of central importance in a wide variety of scientific applications, where area cost, error, and latency are key requirements. This paper describes a flexible FPGA implementation of a parameterizable floating-point library for computing the sine, cosine, arctangent, and exponential functions using the CORDIC algorithm. The novelty of the proposed architecture is that, by sharing the same resources, the CORDIC core can operate in two modes, allowing it to compute the sine, cosine, or arctangent functions. Additionally, for the exponential function, the architecture switches automatically between CORDIC and a Taylor-series approach, which improves the precision of the circuit, specifically for small input values after argument reduction. Synthesis of the circuits and an experimental error analysis demonstrate the correctness and effectiveness of the implemented cores and allow the designer to choose, for general-purpose applications, a suitable bit-width representation and number of CORDIC iterations.
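The rotation-mode iteration at the heart of such a sine/cosine core can be sketched in a few lines of Python. This is a floating-point software model of the classic shift-add recurrence, not the fixed-point FPGA datapath of the paper; the function name and iteration count are illustrative.

```python
import math

def cordic_sincos(theta, iterations=16):
    """Rotation-mode CORDIC: rotate the vector (1, 0) by theta in shift-add steps."""
    # Precomputed elementary angles atan(2^-i); in hardware these live in a small ROM.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Constant gain K compensating the magnitude growth of the micro-rotations.
    K = 1.0
    for i in range(iterations):
        K /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = 1.0, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0          # rotate toward the residual angle
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return x * K, y * K                       # (cos(theta), sin(theta))
```

Each iteration resolves roughly one more bit of the angle, which is why the bit-width and the number of iterations are natural parameters of the hardware library.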
Southern Conference on Programmable Logic | 2011
Janier Arias-García; Ricardo P. Jacobi; Carlos H. Llanos; Mauricio Ayala-Rincón
This work presents an architecture for computing matrix inversion on a reconfigurable FPGA with single-precision floating-point representation, whose main unit is a processing component for Gauss-Jordan elimination. This component is built from smaller arithmetic units organized to preserve the accuracy of the results without the need to internally normalize and de-normalize the floating-point data. The implementation of the operations and of the whole unit takes advantage of the resources available in the Virtex-5 FPGA. The performance and resource consumption of the implementation improve on those of more elaborate architectures whose implementations are too complex for low-cost applications. Benchmarks are carried out against solutions previously implemented in FPGA and in software, such as Matlab.
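The algorithm the processing component implements can be sketched in software as Gauss-Jordan elimination on the augmented matrix [A | I]. This Python model (with partial pivoting, which the hardware may or may not perform; the function name is illustrative) shows the row operations the datapath has to carry out.

```python
def gauss_jordan_inverse(A):
    """Invert a square matrix by Gauss-Jordan elimination on [A | I]."""
    n = len(A)
    # Build the augmented matrix [A | I].
    M = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[piv][col]) < 1e-12:
            raise ValueError("matrix is singular")
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]       # scale pivot row to make pivot 1
        for r in range(n):                     # eliminate the column elsewhere
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    # The right half of the augmented matrix is now the inverse.
    return [row[n:] for row in M]
```

In hardware the inner row update (a multiply-subtract per element) is the operation that gets replicated and pipelined.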
Symposium on Integrated Circuits and Systems Design | 2009
Diego F. Sánchez; Daniel M. Muñoz; Carlos H. Llanos; Mauricio Ayala-Rincón
Floating-point operations are an essential requisite in a wide range of computational and engineering applications that need good performance and high precision. Current advances in VLSI technology have raised integration density fast enough to allow designers to implement directly in hardware several floating-point operations commonly realized in software. Until now, most research has not focused on the tradeoff between the need for high performance and the cost in logic area associated with the level of precision, parameters that are very important in applications such as robotics, image processing, and digital signal processing. This paper describes an FPGA implementation of a parameterizable floating-point library for addition/subtraction, multiplication, division, and square-root operations. Architectures based on the Goldschmidt algorithm were implemented for computing floating-point division and square root. The library is parameterizable by bit-width and number of iterations. An analysis of the mean square error and of the area cost is carried out in order to find, for general-purpose applications, a feasible bit-width representation, number of iterations, and number of addressable words for storing the initial seeds of the Goldschmidt algorithm.
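For division, Goldschmidt's method multiplies numerator and denominator by the same correction factor until the denominator converges to 1, at which point the numerator holds the quotient. The Python sketch below uses the simple seed f = 2 - d after normalization; the paper's hardware instead reads better initial seeds from a small table, which is exactly what makes the number of iterations and the seed-table size tunable parameters. Names are illustrative.

```python
import math

def goldschmidt_div(n, d, iterations=5):
    """Approximate n/d by Goldschmidt's iteration (numerator and denominator
    are updated in parallel, which maps well onto pipelined multipliers)."""
    # Normalize so the denominator lies in [0.5, 1); convergence is then quadratic.
    e = math.frexp(d)[1]            # d = m * 2**e with m in [0.5, 1)
    d_s = d / 2.0 ** e
    n_s = n / 2.0 ** e
    for _ in range(iterations):
        f = 2.0 - d_s               # correction factor (a seed table refines this)
        n_s *= f
        d_s *= f                    # d_s -> 1, so n_s -> n/d
    return n_s
```

Because the error squares at every step, each extra iteration roughly doubles the number of correct bits, which is the tradeoff the error-versus-area analysis in the paper explores.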
Journal of Parallel and Distributed Computing | 2007
Azzedine Boukerche; Alba Cristina Magalhaes Alves de Melo; Mauricio Ayala-Rincón; Maria Emilia Telles Walter
Sequence comparison is a basic operation in DNA sequencing projects, and most sequence comparison methods in use are based on heuristics, which are faster but offer no guarantee that the best alignments are produced. On the other hand, the algorithm proposed by Smith and Waterman obtains the best local alignments, at the expense of very high computing power and huge memory requirements. In this article, we present and evaluate our experiments with three strategies for running the Smith-Waterman algorithm on a cluster of workstations using a distributed shared memory system. Our results on an eight-machine cluster show very good speedups and indicate that impressive improvements can be achieved, depending on the strategy used. We also present some theoretical remarks on how to reduce the amount of memory used.
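The quadratic cost the article refers to comes from the Smith-Waterman dynamic-programming recurrence, sketched below in a score-only Python version (linear gap penalty; the scoring values are illustrative defaults, not the ones used in the experiments).

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Score-only Smith-Waterman: best local alignment score of a vs b."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,                    # local alignment may restart
                          H[i - 1][j - 1] + s,  # match / mismatch
                          H[i - 1][j] + gap,    # gap in b
                          H[i][j - 1] + gap)    # gap in a
            best = max(best, H[i][j])
    return best
```

Since cell (i, j) depends only on its left, upper, and upper-left neighbors, anti-diagonals of the matrix can be computed in parallel, which is what cluster parallelizations of the algorithm exploit.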
Bio-Inspired Computing: Theories and Applications | 2010
Daniel M. Muñoz; Carlos H. Llanos; Leandro dos Santos Coelho; Mauricio Ayala-Rincón
Particle Swarm Optimization (PSO) algorithms have been proposed to solve engineering problems that require finding an optimal operating point. Several embedded applications require solving online optimization problems with high performance. However, PSO suffers from long execution times, and this becomes evident when using Reduced Instruction Set Computer (RISC) microprocessors, whose operating frequencies are low compared with those of traditional personal computers (PCs). This paper compares two hardware implementations of the parallel PSO algorithm using an efficient floating-point arithmetic that performs computations with a large dynamic range and high precision. The full-parallel and partially-parallel PSO architectures allow the parallel capabilities of PSO to be exploited in order to decrease the running time. Three well-known benchmark test functions are used to validate the hardware architectures, and a comparison in terms of logic-area cost, quality of solution, and performance is reported. In addition, a comparison of execution times between the hardware and two C-code software implementations, one on an Intel Core Duo at 1.6 GHz and one on an embedded MicroBlaze microprocessor at 50 MHz, is presented.
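The update rule that both hardware architectures parallelize can be stated compactly in software. The Python sketch below is a minimal global-best PSO on the sphere benchmark (one of the standard test functions; the particular constants and function names here are illustrative, not the paper's configuration).

```python
import random

def pso(f, dim, bounds, particles=30, iters=100, w=0.7, c1=1.5, c2=1.5, seed=1):
    """Global-best PSO minimizing f over the box [bounds[0], bounds[1]]^dim."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    pbest = [p[:] for p in pos]
    pbest_f = [f(p) for p in pos]
    g = min(range(particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    for _ in range(iters):
        for i in range(particles):
            for d in range(dim):
                # Inertia + cognitive pull (own best) + social pull (swarm best).
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            fi = f(pos[i])
            if fi < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], fi
                if fi < gbest_f:
                    gbest, gbest_f = pos[i][:], fi
    return gbest, gbest_f

# Sphere benchmark: global minimum 0 at the origin.
best, score = pso(lambda x: sum(v * v for v in x), dim=2, bounds=(-5.0, 5.0))
```

The per-particle updates are independent within an iteration, which is the parallelism a full-parallel architecture exposes, at the price of replicating the floating-point datapath once per particle.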
Intelligent Systems Design and Applications | 2009
Daniel Muñoz Arboleda; Carlos H. Llanos; Leandro dos Santos Coelho; Mauricio Ayala-Rincón
The high computational cost of solving large engineering optimization problems motivates the design of parallel optimization algorithms. Population-based optimization algorithms provide parallel capabilities that can be exploited by implementing them directly in hardware. This paper presents a hardware implementation of Particle Swarm Optimization algorithms using an efficient floating-point arithmetic that performs the computations with high precision. All the architectures are parameterizable by bit-width, allowing the designer to choose a suitable format according to the requirements of the optimization problem. Synthesis and simulation results demonstrate that the proposed architecture achieves satisfactory results, obtaining better performance in terms of elapsed time than conventional software implementations.
Field-Programmable Logic and Applications | 2005
Carlos Morra; Jürgen Becker; Mauricio Ayala-Rincón; Reiner W. Hartenstein
FELIX is a new design space exploration tool and graphical integrated development environment (IDE) for programming coarse-grained reconfigurable architectures. Its main and novel advantage is the use of rewriting rules and logical strategies for the automated generation of alternative, functionally equivalent implementations from a single mathematical specification. The user's selection of the rewriting-logic strategies to be applied determines the resulting implementations, making it possible to quickly generate, simulate, and evaluate alternative implementations that are logically equivalent. The FELIX system includes an interface to the KressArray Xplorer for hardware design-space exploration. The current version of the tool targets the PACT eXtreme Processing Platform (XPP), with support for additional architectures planned for future versions.
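The core idea, that applying different sets of semantics-preserving rewrite rules to one specification yields different but equivalent implementations, can be illustrated with a toy term rewriter. The Python sketch below is not FELIX and uses an ad-hoc tuple encoding of expressions; the two strength-reduction rules stand in for the strategies a user would select.

```python
def rewrite(expr, rules):
    """Apply rewrite rules bottom-up until a fixed point (innermost strategy)."""
    if isinstance(expr, tuple):
        # Rewrite subterms first, then try the rules at this node.
        expr = (expr[0],) + tuple(rewrite(a, rules) for a in expr[1:])
    changed = True
    while changed:
        changed = False
        for rule in rules:
            new = rule(expr)
            if new is not None:
                expr, changed = new, True
    return expr

# Two illustrative strength-reduction rules; the chosen rule set plays the role
# of the "strategy" that steers which equivalent implementation emerges.
def mul_to_shift(e):
    if isinstance(e, tuple) and e[0] == "*" and e[2] == ("const", 2):
        return ("<<", e[1], ("const", 1))     # x * 2  ->  x << 1

def mul_identity(e):
    if isinstance(e, tuple) and e[0] == "*" and e[2] == ("const", 1):
        return e[1]                           # x * 1  ->  x
```

With a multiply-to-shift rule enabled, the specification lowers to shifter hardware; without it, to a multiplier, and both results are provably equivalent to the input term.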
ACM Transactions on Design Automation of Electronic Systems | 2006
Mauricio Ayala-Rincón; Carlos H. Llanos; Ricardo P. Jacobi; Reiner W. Hartenstein
Many algebraic operations can be efficiently implemented as pipe networks in arrays of functional units, such as systolic arrays, that provide a large amount of parallelism. However, the applicability of classical systolic arrays is restricted to problems with strictly regular data dependencies, yielding only arrays with uniform linear pipes. This limitation can be circumvented by using reconfigurable systolic arrays or reconfigurable data path arrays, where the node interconnections and operations can be redefined even at run time. In this context, several alternative reconfigurable systolic architectures can be explored, and powerful tools are needed to model and evaluate them. Well-known rewriting-logic environments such as ELAN and Maude can be used to specify and simulate complex application-specific integrated systems. In this article we propose a methodology based on rewriting logic that is adequate for quickly modeling and evaluating reconfigurable architectures (RAs) in general and reconfigurable systolic architectures in particular. As an interesting case study, we apply this rewriting-logic modeling methodology to the space-efficient treatment of the Fast Fourier Transform (FFT). The FFT prototype conceived in this way has been specified and validated in VHDL using the Quartus II system.
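The FFT is a natural case study for such arrays because it decomposes entirely into one repeated node, the butterfly. The Python sketch below is a plain recursive radix-2 decimation-in-time FFT, given only to show the butterfly structure that a pipe-network or systolic realization replicates; it is not the VHDL prototype from the article.

```python
import cmath

def fft(x):
    """Recursive radix-2 decimation-in-time FFT (length must be a power of two)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # One butterfly: the basic node a systolic FFT array repeats in space.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

A space-efficient array folds the log2(n) butterfly stages onto a smaller set of reconfigurable nodes, reusing them across stages, which is exactly the kind of alternative a rewriting-logic model lets one explore before committing to VHDL.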
International Parallel and Distributed Processing Symposium | 2005
Azzedine Boukerche; A.C.M.A. de Melo; Mauricio Ayala-Rincón
Sequence comparison is a basic operation in DNA sequencing projects, and most sequence comparison methods in use are based on heuristics, which are faster but offer no guarantee that the best alignments are produced. On the other hand, the algorithm proposed by Smith and Waterman obtains the best local alignments, at the expense of very high computing power and huge memory requirements. In this article, we present and evaluate our experiments with three strategies for running the Smith-Waterman algorithm on a cluster of workstations using a distributed shared memory system. Our results on an eight-machine cluster show very good speedups and indicate that impressive improvements can be achieved, depending on the strategy used. We also present some theoretical remarks on how to reduce the amount of memory used.