Marcelo Naiouf
National University of La Plata
Publications
Featured research published by Marcelo Naiouf.
Physica A-statistical Mechanics and Its Applications | 2017
Aurelio Fernández Bariviera; María José Basgall; Waldo Hasperué; Marcelo Naiouf
In recent years, a new type of tradable asset has appeared, generically known as cryptocurrencies. Among them, the most widespread is Bitcoin. Given its novelty, this paper investigates some statistical properties of the Bitcoin market. The study compares Bitcoin and standard currency dynamics and focuses on the analysis of returns at different time scales. We test for the presence of long memory in return time series from 2011 to 2017, using transaction data from one Bitcoin platform. We compute the Hurst exponent by means of the Detrended Fluctuation Analysis method, using a sliding window in order to measure long-range dependence. We find that the Hurst exponent changes significantly during Bitcoin's first years of existence, tending to stabilize in more recent times. Additionally, multiscale analysis shows a similar behavior of the Hurst exponent, implying a self-similar process.
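As a rough illustration of the sliding-window approach described above, the following Python sketch estimates the Hurst exponent of a return series via Detrended Fluctuation Analysis; the function names, window sizes and scales are illustrative assumptions, not parameters taken from the paper.

```python
import numpy as np

def dfa_hurst(returns, scales=(16, 32, 64, 128, 256)):
    """Estimate the Hurst exponent of a return series via DFA."""
    profile = np.cumsum(returns - np.mean(returns))   # integrated (profile) series
    flucts = []
    for s in scales:
        n_seg = len(profile) // s
        rms = []
        for i in range(n_seg):
            seg = profile[i * s:(i + 1) * s]
            t = np.arange(s)
            trend = np.polyval(np.polyfit(t, seg, 1), t)  # linear detrending per segment
            rms.append(np.sqrt(np.mean((seg - trend) ** 2)))
        flucts.append(np.mean(rms))
    # Hurst exponent = slope of log F(s) versus log s
    h, _ = np.polyfit(np.log(scales), np.log(flucts), 1)
    return h

def sliding_hurst(returns, window=500, step=50):
    """Hurst exponent over a sliding window, as in the long-memory analysis."""
    return [dfa_hurst(returns[i:i + window])
            for i in range(0, len(returns) - window + 1, step)]

if __name__ == "__main__":
    # Synthetic check: uncorrelated noise should give H close to 0.5
    rng = np.random.default_rng(0)
    print(round(dfa_hurst(rng.normal(size=5000)), 2))
```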
Concurrency and Computation: Practice and Experience | 2015
Enzo Rucci; Carlos García; Guillermo Botella; Armando Eduardo De Giusti; Marcelo Naiouf; Manuel Prieto-Matías
Alignment is essential in many areas such as biological, chemical and criminal forensics. The well-known Smith–Waterman (SW) algorithm is able to retrieve the optimal local alignment with quadratic time and space complexity. Several implementations take advantage of parallel computing on many-cores, FPGAs or GPUs in order to reduce the alignment effort. In this research, we adapt, develop and tune the SW implementation named SWIMM on a heterogeneous platform based on an Intel Xeon processor and a Xeon Phi coprocessor. SWIMM is a free tool available in a public git repository: https://github.com/enzorucci/SWIMM. We efficiently exploit data- and thread-level parallelism, reaching up to 380 GCUPS on the heterogeneous architecture, 350 GCUPS for the isolated Xeon and 50 GCUPS on the Xeon Phi. Although the heterogeneous implementation achieves the best performance, it is also the most energy-demanding, so we also present a trade-off analysis between performance and power consumption. The greenest configuration is based on an isolated multicore system that exploits the AVX2 instruction set architecture, reaching 1.5 GCUPS/Watt.
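For readers unfamiliar with the GCUPS figures quoted throughout these results, the sketch below shows how alignment throughput and the performance/power ratio are typically computed; the numbers plugged in are placeholders, not measurements from the paper.

```python
def gcups(query_len, db_residues, seconds):
    """Giga Cell Updates Per Second: DP matrix cells computed per second."""
    return (query_len * db_residues) / seconds / 1e9

def gcups_per_watt(gcups_value, avg_power_watts):
    """Energy-efficiency figure of merit used in performance/power trade-offs."""
    return gcups_value / avg_power_watts

# Hypothetical run: a 500-residue query against a 2e9-residue database in 4 s,
# with an assumed average power draw of 180 W.
g = gcups(500, 2_000_000_000, 4.0)
print(f"{g:.1f} GCUPS, {gcups_per_watt(g, 180.0):.2f} GCUPS/Watt")
```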
international conference on cluster computing | 2014
Enzo Rucci; Armando Eduardo De Giusti; Marcelo Naiouf; Guillermo Botella; Carlos García; Manuel Prieto-Matías
The well-known Smith-Waterman (SW) algorithm is a high-sensitivity method for local alignments. However, SW is expensive in terms of both execution time and memory usage, which makes it impractical in many applications. Some heuristics are possible, but at the expense of losing sensitivity. Fortunately, previous research has shown that new computing platforms such as GPUs and FPGAs are able to accelerate SW and achieve impressive speedups. In this paper we explore SW acceleration on a heterogeneous platform equipped with an Intel Xeon Phi coprocessor. Our evaluation, using the well-known Swiss-Prot database as a benchmark, shows that a hybrid CPU-Phi heterogeneous system is able to achieve competitive performance (62.6 GCUPS), even with moderate low-level optimisations.
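As background for the acceleration work, a minimal serial Smith–Waterman scoring kernel looks roughly like the following; the scoring scheme and linear gap penalty are simplified assumptions, not the exact parameters used in the paper, and real implementations add affine gaps plus SIMD and heterogeneous parallelism.

```python
def smith_waterman_score(q, d, match=2, mismatch=-1, gap=-2):
    """Return the optimal local alignment score between sequences q and d.

    Straightforward O(len(q)*len(d)) dynamic programming with a linear gap
    penalty, kept to two rows of the DP matrix.
    """
    cols = len(d) + 1
    prev = [0] * cols
    best = 0
    for i in range(1, len(q) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            s = match if q[i - 1] == d[j - 1] else mismatch
            curr[j] = max(0,
                          prev[j - 1] + s,    # match/mismatch
                          prev[j] + gap,      # gap in d
                          curr[j - 1] + gap)  # gap in q
            best = max(best, curr[j])
        prev = curr
    return best

print(smith_waterman_score("HEAGAWGHEE", "PAWHEAE"))
```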
european conference on parallel processing | 2009
Mario Leandro Bertogna; Eduardo Grosclaude; Marcelo Naiouf; Armando Eduardo De Giusti; Emilio Luque
In Grid environments, many different resources are intended to work in a coordinated manner, each resource having its own features and complexity. As the number of resources grows, simplifying automation and management is among the most important issues to address. This paper's contribution lies in the extension and implementation of a grid metascheduler that dynamically discovers, creates and manages on-demand virtual clusters. The first module selects clusters using graph heuristics: the algorithm searches for the set of clusters, mapped onto the graph, that achieves the best performance for a given task. The second module, one per grid node, monitors and manages physical and virtual machines. When a new task arrives, these modules modify virtual machine configurations or use live migration to dynamically adapt resource distribution across the clusters, obtaining maximum utilization. Metascheduler components and local administrator modules work together to make decisions at run time to balance and optimize system throughput. This implementation yields a performance improvement of 20% in total computing time, with machines and clusters processing 100% of their working time. These results allow us to conclude that the solution is feasible for Grid environments, where automation and self-management are key to attaining effective resource usage.
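The abstract only outlines the graph-based cluster selection, so the sketch below is a deliberately simplified greedy interpretation of "pick the set of clusters that best serves an incoming task"; all names and the ranking criterion are invented for illustration and do not reproduce the paper's heuristics.

```python
# Hypothetical greedy selection of virtual clusters for an incoming task:
# accumulate clusters until the aggregated capacity covers the task demand,
# preferring lightly loaded, well-connected clusters.

def select_clusters(clusters, demand):
    """clusters: list of dicts with 'name', 'free_cores', 'link_bw' (arbitrary units)."""
    ranked = sorted(clusters,
                    key=lambda c: (c["free_cores"], c["link_bw"]),
                    reverse=True)
    chosen, cores = [], 0
    for c in ranked:
        if cores >= demand:
            break
        chosen.append(c["name"])
        cores += c["free_cores"]
    return chosen if cores >= demand else None  # None: demand cannot be met

print(select_clusters(
    [{"name": "c1", "free_cores": 16, "link_bw": 10},
     {"name": "c2", "free_cores": 8,  "link_bw": 40},
     {"name": "c3", "free_cores": 32, "link_bw": 1}],
    demand=40))
```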
ieee international conference on high performance computing data and analytics | 2006
Marcelo Naiouf; Laura Cristina De Giusti; Franco Chichizola; Armando Eduardo De Giusti
This paper discusses the dynamic and static balancing of non-homogeneous cluster architectures, simultaneously analyzing the theoretical parallel speedup as well as the speedup obtained experimentally. A classical application (Parallel N-Queens) with a parallel solution algorithm in which processing predominates over communication has been chosen, so as to study the load-balancing aspects (dynamic or static) in depth without results being distorted by communication overhead. Four interconnected clusters have been used, in which the machines within each cluster have homogeneous processors although different among clusters. Thus, the set can be seen as an N-processor heterogeneous cluster or as a multi-cluster scheme with 4 subsets of homogeneous processors. At the same time, three forms of load distribution among the processors (Direct Static, Predictive Static and Dynamic by Demand) have been studied, analyzing in each case the parallel speedup and the load imbalance with respect to problem size and the processors used.
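To make the "Dynamic by Demand" distribution concrete, here is a small Python sketch in which idle workers pull first-row branches of the N-Queens search on demand; it mirrors the idea on a single machine with multiprocessing, not the paper's multi-cluster implementation.

```python
from multiprocessing import Pool

def count_from_first_column(args):
    """Count N-Queens solutions with the row-0 queen fixed in column c."""
    n, c = args

    def place(row, cols, diag1, diag2):
        if row == n:
            return 1
        total = 0
        for col in range(n):
            if col in cols or (row - col) in diag1 or (row + col) in diag2:
                continue
            total += place(row + 1, cols | {col},
                           diag1 | {row - col}, diag2 | {row + col})
        return total

    return place(1, {c}, {-c}, {c})

if __name__ == "__main__":
    n = 10
    with Pool() as pool:
        # imap_unordered hands out branches as workers become idle ("by demand")
        total = sum(pool.imap_unordered(count_from_first_column,
                                        [(n, c) for c in range(n)]))
    print(total)  # 724 solutions for n = 10
```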
international symposium elmar | 2005
Franco Chichizola; L. De Giusti; A. De Giusti; Marcelo Naiouf
A new algorithm for automatic face recognition is presented: Reduced Image Eigenfaces (RIE), which is based on the eigenface model and improves the recognition percentage. The original eigenfaces method has been implemented in order to compare the results obtained by the new method under various conditions, such as the number of people and the number of photos of each of them. In the experiments, we have used a limited, internationally standardized image database. With RIE, two important advantages are achieved in relation to the previous model: an improvement in the recognition success rate, and the possibility of enhancing the set of images used for training.
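The eigenface baseline against which RIE is compared boils down to PCA on vectorised face images plus nearest-neighbour matching in the reduced space. The sketch below illustrates that baseline with synthetic stand-in data; the RIE-specific image-reduction step and the paper's database are not reproduced here.

```python
import numpy as np

def train_eigenfaces(faces, k=8):
    """faces: (n_images, n_pixels) array. Returns the mean face, the top-k
    eigenfaces, and the projections of the training images."""
    mean = faces.mean(axis=0)
    centered = faces - mean
    # SVD of the centered data yields the principal components (eigenfaces)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    eigenfaces = vt[:k]
    return mean, eigenfaces, centered @ eigenfaces.T

def recognize(face, mean, eigenfaces, train_proj, labels):
    """Project a probe face and return the label of the nearest training image."""
    proj = (face - mean) @ eigenfaces.T
    dists = np.linalg.norm(train_proj - proj, axis=1)
    return labels[int(np.argmin(dists))]

# Toy usage with random "images" standing in for a normalized face database
rng = np.random.default_rng(1)
train = rng.random((20, 64 * 64))
labels = [f"person_{i % 5}" for i in range(20)]
mean, efaces, proj = train_eigenfaces(train)
print(recognize(train[3] + 0.01 * rng.random(64 * 64), mean, efaces, proj, labels))
```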
International Journal of High Performance Computing Applications | 2018
Enzo Rucci; Carlos García; Guillermo Botella; Armando Eduardo De Giusti; Marcelo Naiouf; Manuel Prieto-Matías
The well-known Smith–Waterman algorithm is a high-sensitivity method for local sequence alignment. Unfortunately, the Smith–Waterman algorithm has quadratic time complexity, which makes it computationally demanding for large protein databases. In this paper, we present OSWALD, a portable, fully functional and general implementation to accelerate Smith–Waterman database searches on heterogeneous platforms based on Altera FPGAs. OSWALD exploits OpenMP multithreading and SIMD computing through SSE and AVX2 extensions on the host, while taking advantage of pipeline and vectorial parallelism by way of OpenCL on the FPGAs. Performance evaluations on two different heterogeneous architectures with real amino acid datasets show that OSWALD is competitive with other top-performing Smith–Waterman implementations, attaining up to 442 GCUPS peak with the best GCUPS/Watt ratio.
international conference of the chilean computer science society | 2008
L. De Giusti; Franco Chichizola; Marcelo Naiouf; A. De Giusti
An automatic task-to-processor mapping algorithm is analyzed for parallel systems that run over loosely coupled distributed architectures. This research is based on the TTIGHa model, which allows predicting the performance of parallel applications running over heterogeneous architectures; in particular, the heterogeneity of both processors and communications is taken into consideration. Building on the results obtained with the TTIGHa model, the MATEHa algorithm for task-to-processor assignment is presented and its implementation is analyzed. Experimental results on subsets of machines from two heterogeneous clusters are presented, comparing the mapping produced by MATEHa with two previous mapping methods: MATE and HEFT. Finally, the algorithm's robustness is considered based on the variation of model parameters: inter-process communication times and processing times.
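Since MATEHa and HEFT are both list-scheduling style mappers, the following sketch shows the general earliest-finish-time assignment pattern on a heterogeneous machine set; it is a generic, communication-free illustration, not the MATEHa algorithm itself, and it ignores the TTIGHa communication model.

```python
def map_tasks(task_costs, n_procs):
    """task_costs[t][p]: estimated execution time of task t on processor p.
    Tasks are assigned, in the given order, to the processor that yields the
    earliest finish time (a simplified list scheduler)."""
    ready = [0.0] * n_procs           # time at which each processor becomes free
    mapping = []
    for t, costs in enumerate(task_costs):
        finish = [ready[p] + costs[p] for p in range(n_procs)]
        best = min(range(n_procs), key=lambda p: finish[p])
        ready[best] = finish[best]
        mapping.append((t, best))
    return mapping, max(ready)        # assignment and resulting makespan

# Three tasks on two heterogeneous processors (times are illustrative)
m, makespan = map_tasks([[4.0, 2.0], [3.0, 6.0], [5.0, 5.0]], n_procs=2)
print(m, makespan)
```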
Archive | 2016
Enzo Rucci; Carlos García; Guillermo Botella; Armando Eduardo De Giusti; Marcelo Naiouf; Manuel Prieto-Matías
Searching biological sequence databases is a common and repeated task in bioinformatics and molecular biology. The Smith–Waterman algorithm is the most accurate method for this kind of search. Unfortunately, this algorithm is computationally demanding, and the situation has become worse due to the exponential growth of biological data in recent years. For that reason, the scientific community has made great efforts to accelerate Smith–Waterman biological database searches on a wide variety of hardware platforms. We give a survey of the state of the art in Smith–Waterman protein database search, focusing on four hardware architectures: central processing units, graphics processing units, field-programmable gate arrays and Xeon Phi coprocessors. After briefly describing each hardware platform, we analyse the temporal evolution, contributions, limitations, experimental work and results of each implementation. Additionally, as energy efficiency is becoming more important every day, we also survey performance/power consumption work. Finally, we give our view on the future of Smith–Waterman protein searches considering next generations of hardware architectures and their upcoming technologies.
international conference on algorithms and architectures for parallel processing | 2017
Enzo Rucci; Carlos García; Guillermo Botella; Armando Eduardo De Giusti; Marcelo Naiouf; Manuel Prieto-Matías
The well-known Smith-Waterman (SW) algorithm is the most commonly used method for local sequence alignments. However, SW is very computationally demanding for large protein databases. There are several implementations that take advantage of parallel capabilities of many-cores, FPGAs or GPUs in order to increase alignment throughput. In this paper, we explore SW acceleration on the Intel KNL processor. The novelty of this architecture requires revising previous programming and optimization techniques for many-core architectures. To the best of the authors' knowledge, this is the first assessment of the KNL architecture for the SW algorithm. Our evaluation, using the renowned Environmental NR database as a benchmark, shows that multi-threading and SIMD exploitation deliver competitive performance (351 GCUPS) in comparison with other implementations.