Anna Sikora
Autonomous University of Barcelona
Publications
Featured research published by Anna Sikora.
Parallel Computing | 2012
Renato Miceli; Gilles Civario; Anna Sikora; Eduardo César; Michael Gerndt; Houssam Haitof; Carmen B. Navarrete; Siegfried Benkner; Martin Sandrieser; Laurent Morin; François Bodin
Performance analysis and tuning is an important step in programming multicore- and manycore-based parallel architectures. While there are several tools to help developers analyze application performance, no tool provides recommendations about how to tune the code. The AutoTune project is extending Periscope, an automatic distributed performance analysis tool developed by Technische Universität München, with plugins for performance and energy efficiency tuning. The resulting Periscope Tuning Framework will be able to tune serial and parallel codes for multicore and manycore architectures and return tuning recommendations that can be integrated into the production version of the code. The whole tuning process, both performance analysis and tuning, will be performed automatically during a single run of the application.
International Journal of Parallel Programming | 2014
Claudia Rosas; Anna Sikora; Josep Jorba; Andreu Moreno; Eduardo César
Data-intensive applications are those that explore, query, analyze, and, in general, process very large data sets. Generally, these applications can be naturally implemented in parallel but, in many cases, these implementations show severe performance problems, mainly due to load imbalances, inefficient use of available resources, and improper data partition policies. It is worth noticing that the problem becomes more complex when the conditions causing these problems change at run time. This paper proposes a methodology for dynamically improving the performance of certain data-intensive applications in homogeneous clusters, based on adapting the size and number of data partitions and the number of processing nodes to the current application conditions. To this end, the processing of each exploration is monitored, and the gathered data is used to dynamically tune the performance of the application. The tuning parameters included in the methodology are: (i) the partition factor of the data set, (ii) the distribution of the data chunks, and (iii) the number of processing nodes to be used. The methodology assumes that a single execution includes multiple related explorations on the same partitioned data set, and that data chunks are ordered according to their processing times during the application execution so that the most time-consuming partitions are assigned first. The methodology has been validated using the well-known bioinformatics tool BLAST and through extensive experimentation using simulation. Reported results are encouraging in terms of reducing the total execution time of the application (up to 40% in some cases).
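The chunk-distribution idea described above, handing out the most time-consuming partitions first, is essentially longest-processing-time-first scheduling. A minimal sketch in Python (function names and the greedy least-loaded assignment are illustrative, not taken from the paper):

```python
import heapq

def assign_chunks(chunk_times, num_nodes):
    """Greedy longest-processing-time-first assignment: hand the most
    time-consuming chunks out first, always to the least-loaded node.
    chunk_times: observed processing time per chunk (hypothetical input)."""
    # (load, node_id) min-heap tracks each node's accumulated work
    heap = [(0.0, n) for n in range(num_nodes)]
    heapq.heapify(heap)
    assignment = {n: [] for n in range(num_nodes)}
    # Visit chunk indices in order of observed processing time, longest first
    for chunk in sorted(range(len(chunk_times)),
                        key=lambda c: chunk_times[c], reverse=True):
        load, node = heapq.heappop(heap)
        assignment[node].append(chunk)
        heapq.heappush(heap, (load + chunk_times[chunk], node))
    return assignment
```

In a dynamic setting, the per-chunk times would be refreshed after each exploration, so the assignment adapts as conditions change.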
Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications | 2016
Anna Sikora; Eduardo César; Isaías Comprés; Michael Gerndt
The main problem when trying to optimize the parameters of libraries such as MPI is that there are many parameters that users can configure. Moreover, predicting the behavior of the library for each configuration is non-trivial. This makes it very difficult to select good values for these parameters. This paper proposes a model for autotuning MPI applications. The model is developed to analyze different parameter configurations and is expected to aid users in finding the best performance for executing their applications. As part of the AutoTune project, our work is ultimately aimed at extending Periscope to apply automatic tuning to parallel applications. In particular, our objective is to provide a straightforward way of tuning MPI parallel codes. The output of the framework is a set of tuning recommendations that can be integrated into the production version of the code. Experimental tests demonstrate that this methodology could lead to significant performance improvements.
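As an illustration of the kind of search such a model supports, the sketch below exhaustively evaluates combinations of MPI parameter values and keeps the fastest one. The parameter names and the measurement callback are hypothetical placeholders, not the actual AutoTune interface:

```python
from itertools import product

def best_configuration(param_space, measure):
    """Evaluate every combination of parameter values (e.g. an eager
    limit, a collective algorithm choice) and keep the fastest.
    `measure` runs the application with one configuration and returns
    its execution time; both names here are illustrative."""
    names = list(param_space)
    best, best_time = None, float("inf")
    for values in product(*(param_space[n] for n in names)):
        config = dict(zip(names, values))
        t = measure(config)
        if t < best_time:
            best, best_time = config, t
    return best, best_time
```

A real tuner would prune this search with a performance model rather than measure every point, which is precisely what makes the modeling step in the paper valuable.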
The Journal of Supercomputing | 2014
Carlos Brun; Tomàs Margalef; Ana Cortés; Anna Sikora
The Two-Stage forest fire spread prediction methodology was developed to enhance forest fire evolution forecasts by tackling the uncertainty of some environmental conditions. However, some parameters, such as wind, vary across the terrain and over time. In such cases, it is necessary to couple forest fire propagation models with complementary models, such as meteorological forecast and wind field models. This multi-model approach improves the accuracy of the predictions but introduces an overhead in the execution time. In this paper, different multi-model approaches are discussed, and the results show that the propagation prediction is improved. By exploiting the multi-core architectures of current processors, we can reduce the overhead introduced by the complementary models.
Scientific Programming | 2014
Andrea Martínez; Anna Sikora; Eduardo César; Joan Sorribes
The spectacular growth in the number of cores in current supercomputers poses design challenges for the development of performance analysis and tuning tools. To be effective, such analysis and tuning tools must be scalable and able to manage the dynamic behaviour of parallel applications. In this work, we present ELASTIC, an environment for dynamic tuning of large-scale parallel applications. To be scalable, the architecture of ELASTIC takes the form of a hierarchical tuning network of nodes that perform a distributed analysis and tuning process. Moreover, the tuning network topology can be configured to adapt itself to the size of the parallel application. To guide the dynamic tuning process, ELASTIC supports a plugin architecture. These plugins, called ELASTIC packages, allow the integration of different tuning strategies into ELASTIC. We also present experimental tests conducted using ELASTIC, showing its effectiveness in improving the performance of large-scale parallel applications.
Journal of Parallel and Distributed Computing | 2017
Eduardo César; Ana Cortés; Antonio Espinosa; Tomàs Margalef; Juan C. Moure; Anna Sikora; Remo Suppi
Nowadays, many fields of science and engineering are evolving through the joint contribution of complementary fields. Computer science, and especially High Performance Computing, has become a key factor in the development of many research fields, establishing a new paradigm called computational science. Researchers and professionals from many different fields require knowledge of High Performance Computing, including parallel programming, to develop fruitful and efficient work in their particular field. Therefore, at Universitat Autònoma de Barcelona (Spain), an interdisciplinary Master on “Modeling for Science and Engineering” was started 5 years ago to provide graduate students in different fields (Mathematics, Physics, Chemistry, Engineering, Geology, etc.) with a thorough knowledge of the application of modeling and simulation. In this Master’s degree, “Parallel Programming” appears as a compulsory subject because it is a key topic for them. The concepts learned in this subject must be applied to real applications. Therefore, a complementary subject on “Applied Modeling and Simulation” has also been included. It is very important to show the students how to analyze their particular problems, think about them from a computational perspective and consider the related performance issues. So, in this paper, the methodology and the experience of introducing computational thinking, parallel programming and performance engineering in this interdisciplinary Master’s degree are presented. This overall approach has been refined over the Master’s lifetime, leading to excellent academic results and improving both industry’s and students’ appraisal of the programme.
International Conference on Conceptual Structures | 2014
César Allande; Josep Jorba; Anna Sikora; Eduardo César
The performance of OpenMP applications executed on multisocket multicore processors can be limited by the memory interface. In a multisocket environment, each multicore processor can suffer performance degradation in memory-bound parallel regions when sharing the same Last Level Cache (LLC). We propose a characterization of the performance of parallel regions to estimate cache misses and execution time. This model is used to select the number of threads and the affinity distribution for each parallel region. The model is applied to the SP and MG benchmarks from the NAS Parallel Benchmark suite using different workloads on two different multicore, multisocket systems. The results show that the estimation preserves the behavior observed in measured executions for the affinity configurations evaluated. The estimated execution time is used to select a set of configurations that minimize the impact of memory contention, achieving significant improvements compared with the default configuration using all threads.
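The final selection step described above can be sketched as picking, per parallel region, the (thread count, affinity) pair with the lowest model-estimated time. Here `estimate_time` stands in for the paper's cache-miss-based performance model, and all names are illustrative assumptions:

```python
def select_configuration(estimate_time, thread_counts, affinities):
    """Pick, for one parallel region, the (threads, affinity) pair whose
    model-estimated execution time is lowest. `estimate_time(t, a)` is a
    stand-in for the cache-miss-based model (hypothetical interface)."""
    candidates = [(estimate_time(t, a), t, a)
                  for t in thread_counts for a in affinities]
    _, threads, affinity = min(candidates)
    return threads, affinity
```

Because the model is evaluated instead of the application, the whole candidate space can be scanned without paying for a real execution per configuration.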
International Conference on Conceptual Structures | 2013
Andrea Martínez; Anna Sikora; Eduardo César; Joan Sorribes
Automatic analysis and tuning is a key strategy for exploiting the potential of high performance systems. However, for parallel applications with long running times, dynamic behaviour or highly data-dependent performance patterns, it is necessary to draw on the strength of dynamic auto-tuning. An important factor in dynamic auto-tuning at large scale is the number of additional resources required by the tuning system itself, which must be limited in order to reduce the impact on application performance. A tradeoff must be made between the loss of effectiveness of a tuning system using too few resources and the loss of its efficiency using too many. Most automatic analysis or tuning systems do not provide assistance for determining how many additional resources are required. In this work, we address this problem by proposing a method for calculating the structure of hierarchical tuning networks, whose topology is composed of the minimum number of non-saturated resources. The experimental evaluation covers different use cases, each showing that tuning networks built according to our proposal make efficient use of resources while providing a high-quality analysis and tuning environment.
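Assuming each tuning node can serve a bounded number of children before saturating, a minimal hierarchical topology can be sized by repeated ceiling division. This sketch illustrates the idea under that assumption; it is not the paper's actual construction:

```python
import math

def tuning_network_size(num_processes, capacity):
    """Size a minimal hierarchical tuning network: each tuning node
    handles at most `capacity` children without saturating. Returns
    the number of tuning nodes per level, leaves first (illustrative)."""
    levels = []
    width = num_processes
    while width > 1:
        # Each level aggregates the one below it, capacity children per node
        width = math.ceil(width / capacity)
        levels.append(width)
    if not levels:
        levels = [1]  # a single process still needs one root tuning node
    return levels
```

For example, 1000 application processes with a fan-in of 10 would need three tuning levels of 100, 10 and 1 nodes, i.e. 111 additional resources in total.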
Software Quality Journal | 2018
Michael Gerndt; Siegfried Benkner; Eduardo César; Carmen B. Navarrete; Enes Bajrovic; Jiri Dokulil; Carla Guillén; Robert Mijakovic; Anna Sikora
Developing software applications for high-performance computing (HPC) requires careful optimizations targeting a myriad of increasingly complex, highly interrelated software, hardware and system components. The demands placed on minimizing energy consumption on extreme-scale HPC systems and the associated shift towards heterogeneous architectures add yet another level of complexity to program development and optimization. As a result, the software optimization process is often seen as daunting, cumbersome and time-consuming by software developers wishing to fully exploit HPC resources. To address these challenges, we have developed the Periscope Tuning Framework (PTF), an online automatic integrated tuning framework that combines both performance analysis and performance tuning with respect to the myriad of tuning parameters available to today’s software developer on modern HPC systems. This work introduces the architecture, tuning model and main infrastructure components of PTF as well as the main tuning plugins of PTF and their evaluation.
European Conference on Parallel Processing | 2015
Eduardo César; Ana Cortés; Antonio Espinosa; Tomàs Margalef; Juan C. Moure; Anna Sikora; Remo Suppi
Nowadays, many fields of science and engineering are evolving through the joint contribution of complementary fields. Computer science, and especially high performance computing, has become a key factor in the development of many research fields, establishing a new paradigm called computational science. Researchers and professionals from many different fields require knowledge of high performance computing, including parallel programming, to carry out fruitful work in their particular field. So, at Universitat Autònoma de Barcelona, an interdisciplinary Master on “Modeling for Science and Engineering” was started 5 years ago to provide graduate students in different fields (Mathematics, Physics, Chemistry, Engineering, Geology, etc.) with a deep knowledge of the application of modeling and simulation. In this Master, “Parallel Programming” appears as a compulsory subject because it is a key topic for them. The concepts learnt in parallel programming must be applied to real applications. Therefore, a subject on “Applied Modelling and Simulation” has also been included. In this paper, the experience of teaching parallel programming in this interdisciplinary Master is presented.