Claudio Schepke

Universidade Federal do Rio Grande do Sul

Publication

Featured research published by Claudio Schepke.

Symposium on Computer Architecture and High Performance Computing | 2009

Parallel lattice Boltzmann method with blocked partitioning

Claudio Schepke; Nicolas Maillard; Philippe Olivier Alexandre Navaux

This paper presents and discusses a blocked parallel implementation of two- and three-dimensional versions of the Lattice Boltzmann Method. This method is used to represent and simulate fluid flows following a mesoscopic approach. Most traditional parallel implementations use simple data distribution strategies to parallelize the operations on the regular fluid data set. However, it is well known that block partitioning is usually better. Such a parallel implementation is discussed and its communication cost is established. Simulations of fluid flow through a cavity have also been used as a real-world case study to evaluate our implementation. The results show that our blocked implementation achieves performance up to 31% better than non-blocked versions for some data distributions. Thus, this work shows that blocked parallel implementations can be used efficiently to reduce the parallel execution time of the method.
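The intuition behind the block-partitioning claim can be illustrated with a rough count of boundary cells. The sketch below is not the communication cost model established in the paper. It assumes a square n-by-n lattice, counts only the cells an interior process exchanges with its edge neighbours per time step, and ignores diagonal neighbours and per-cell distribution counts. The function names are hypothetical.

```python
def strip_halo_cells(n: int, p: int) -> int:
    """Boundary cells an interior process exchanges with a 1D (strip) split.

    Assumption: an n-by-n lattice cut into p horizontal strips, so an
    interior process has two neighbours and shares a full row of n cells
    with each of them.
    """
    if p <= 1:
        return 0  # a single process has nothing to exchange
    return 2 * n


def block_halo_cells(n: int, px: int, py: int) -> int:
    """Boundary cells an interior process exchanges with a px-by-py block split.

    Assumption: each block is (n // px) wide and (n // py) high, and an
    interior process has four edge neighbours (diagonals are ignored).
    """
    width = n // px
    height = n // py
    return 2 * width + 2 * height


if __name__ == "__main__":
    n = 1024
    print("strips, 16 processes:", strip_halo_cells(n, 16))
    print("4 x 4 blocks:        ", block_halo_cells(n, 4, 4))
```

Under these assumptions the strip count stays at 2n however many processes are used, while the block count shrinks as the process grid grows, which is one reason blocked distributions tend to communicate less.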

Symposium on Computer Architecture and High Performance Computing | 2010

I/O Performance Evaluation on Multicore Clusters with Atmospheric Model Environment

Carla Osthoff; Claudio Schepke; Jairo Panetta; Pablo Javier Grunmann; Nicolas Maillard; Philippe Olivier Alexandre Navaux; Pedro L. Silva Dias; Pedro Pais Lopes

This work evaluates the I/O performance of an atmospheric model for weather and climate simulations in a multicore cluster environment. The model handles large data sets for I/O, as is common in scientific applications. The analysis demonstrates that the scalability of the system gets worse as we increase the number of cores per machine, with greater impact on output operations. We also demonstrate the poor capacity of the multicore system to provide high aggregate I/O bandwidth, and that scalability does not improve whether I/O operations run through a parallel file system or on local disk.

International Journal of Parallel Programming | 2013

Online Mesh Refinement for Parallel Atmospheric Models

Claudio Schepke; Nicolas Maillard; Joerg Schneider; Hans-Ulrich Heiss

The forecast precision of climatological models is limited by the computing power and time available for execution. As more and faster processors are used in the computation, the resolution of the mesh adopted to represent the Earth's atmosphere can be increased, and consequently the numerical forecast is more accurate. However, a finer mesh resolution, able to include local phenomena in a global atmosphere integration, is still not possible due to the large number of data elements to compute in this case. To overcome this situation, different mesh refinement levels can be used at the same time for different areas of the domain. Thus, our paper evaluates how mesh refinement at run time (online) can improve performance for climatological models. Online mesh refinement (OMR) dynamically increases mesh resolution in parts of a domain when special atmospheric conditions are registered during the execution. Experimental results show that the execution of a model improved by OMR provides better resolution for the meshes without any significant increase in execution time. The parallel performance of the simulations is also increased through the creation of threads in order to exploit different levels of parallelism.
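The idea of refining only the parts of the domain where a condition is detected can be sketched in one dimension. This is an illustrative toy, not the OMR implementation from the paper: the interval-based mesh, the `flagged` list, and the function name are all assumptions made for the example.

```python
def refine_where_flagged(cells, flagged, factor=2):
    """Return a new 1D mesh in which every flagged cell is split evenly.

    cells   -- list of (start, end) intervals covering the domain
    flagged -- list of booleans, True where a special condition was detected
    factor  -- number of child cells created from each flagged cell
    """
    refined = []
    for (start, end), flag in zip(cells, flagged):
        if flag:
            step = (end - start) / factor
            # Replace the coarse cell with `factor` equally sized children.
            refined.extend(
                (start + i * step, start + (i + 1) * step) for i in range(factor)
            )
        else:
            # Cells without the condition keep their coarse resolution.
            refined.append((start, end))
    return refined


if __name__ == "__main__":
    mesh = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]
    print(refine_where_flagged(mesh, [False, True, False]))
```

Only the flagged cell gains resolution, so the total cell count grows with the size of the affected region rather than with the size of the whole domain.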

Symposium on Computer Architecture and High Performance Computing | 2007

Performance Improvement of the Parallel Lattice Boltzmann Method Through Blocked Data Distributions

Claudio Schepke; Nicolas Maillard

This paper presents a blocked parallel implementation of a three-dimensional version of the Lattice Boltzmann Method. This method is a numerical model used to represent and simulate fluid flows through a mesoscopic approach. Parallel implementations are often adopted to meet the method's substantial memory and processing demands. However, most implementations use simple data distribution strategies to parallelize the operations on the regular fluid data set. Simulations of fluid flow through a cavity have been used as a case study to evaluate our implementation. The results show that blocked implementations achieve performance 31% higher than non-blocked versions for some data distributions. Thus, this work shows that blocked implementations can be used efficiently to reduce the parallel execution time of the method.

Symposium on Computer Architecture and High Performance Computing | 2011

Trace-Based Visualization as a Tool to Understand Applications' I/O Performance in Multi-core Machines

Rodrigo Virote Kassick; Francieli Zanon Boito; Matthias Diener; Philippe Olivier Alexandre Navaux; Yves Denneulin; Claudio Schepke; Nicolas Maillard; Carla Osthoff; Pablo Javier Grunmann; Pedro L. Silva Dias; Jairo Panetta

This paper presents the use of trace-based performance visualization of a large-scale atmospheric model, the Ocean-Land-Atmosphere Model (OLAM). The trace was obtained with the libRastro library, and the visualization was done with Pajé. The visualization was used to analyze OLAM's performance and to identify its bottlenecks. In particular, we are interested in the model's I/O operations, since they proved to be the main issue for the model's performance. We show that most of the time spent in the output routine is spent in the close operation. With this information, we delayed this operation until the next output phase, obtaining improved I/O performance.
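The deferred-close idea described above can be sketched as follows. This is a minimal illustration using plain Python file objects, not OLAM's actual output routine. The class and method names are hypothetical.

```python
class DeferredCloseWriter:
    """Write one output file per phase, closing it only at the next phase."""

    def __init__(self):
        self._pending = None  # handle left open by the previous output phase

    def write_output(self, path, data: bytes):
        # Close the file from the previous phase first, so the cost of
        # close() is paid here rather than inside the phase that wrote it.
        self.flush_pending()
        handle = open(path, "wb")
        handle.write(data)
        self._pending = handle  # intentionally left open until the next phase

    def flush_pending(self):
        # Must also be called once at the end of the run so that the last
        # output file is closed.
        if self._pending is not None:
            self._pending.close()
            self._pending = None
```

A caller would invoke `write_output` once per output phase and `flush_pending` once after the final phase. Whether deferring the close helps depends on the file system and on how much computation separates output phases.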

Symposium on Computer Architecture and High Performance Computing | 2011

Why Online Dynamic Mesh Refinement is Better for Parallel Climatological Models

Claudio Schepke; Nicolas Maillard; Joerg Schneider; Hans-Ulrich Heiss

The forecast precision of climatological models is limited by the computing power and time available for execution. As more and faster processors are used in the computation, the resolution of the mesh adopted to represent the Earth's atmosphere can be increased, and consequently the numerical forecast is more accurate and shows local phenomena. However, a finer mesh resolution, able to include local phenomena in a global atmosphere integration, is still not possible. To overcome this situation, different mesh refinement levels can be used at the same time for different areas. In this context, this paper evaluates how mesh refinement at run time can improve performance for climatological models. To support this analysis, an online dynamic mesh refinement was developed. It increases mesh resolution in parts of a parallel distributed model when special atmospheric conditions are registered during the execution. The results show that the parallel execution of this improvement provides better resolution for the meshes without a significant increase in execution time.

Archive | 2012

Improving Atmospheric Model Performance on a Multi-Core Cluster System

Carla Osthoff; Roberto P. Souto; Fabrício Vilasbôas; Pablo Javier Grunmann; Pedro L. Silva Dias; Francieli Zanon Boito; Rodrigo Virote Kassick; Laércio Lima Pilla; Philippe Olivier Alexandre Navaux; Claudio Schepke; Nicolas Maillard; Jairo Panetta; Pedro Pais Lopes; Robert Walko

Numerical models have been used extensively in recent decades to understand and predict weather phenomena and the climate. In general, models are classified according to their operation domain: global (entire Earth) and regional (country, state, etc.). Global models have a spatial resolution of about 0.2 to 1.5 degrees of latitude and therefore cannot represent the scale of regional weather phenomena very well. Their main limitation is computing power. On the other hand, regional models have higher resolution but are restricted to limited-area domains. Forecasting on a limited domain demands knowledge of future atmospheric conditions at the domain's borders. Therefore, regional models require previous execution of global models.

2012 13th Symposium on Computer Systems | 2012

Exploring Multi-level Parallelism in Atmospheric Applications

Claudio Schepke; Nicolas Maillard

The forecast precision of climatological models is limited by the computing power and time available for execution. As more and faster processors are used in the computation, the resolution of the mesh adopted to represent the Earth's atmosphere can be increased, and consequently the numerical forecasts are more accurate. With the introduction of multi-core processors and GPU boards, computer architectures have many parallel layers. Today, there is parallelism inside a processor, among processors, and among computers. To make the best use of the computers' performance, it is necessary to consider all parallel levels when distributing a concurrent application. However, no parallel programming interface abstracts these different parallel levels well. In this context, this work proposes the use of mixed programming interfaces to improve the performance of atmospheric models. The parallel execution of simulations shows that the use of GPUs and multi-core CPUs in distributed systems can considerably reduce the execution time of climatological applications.

International Journal of Information Technology, Communications and Convergence | 2012

Atmospheric models hybrid OpenMP/MPI implementation multicore cluster evaluation

Carla Osthoff; Francieli Zanon Boito; Rodrigo Virote Kassick; Laércio Lima Pilla; Philippe Olivier Alexandre Navaux; Claudio Schepke; Jairo Panetta; Pablo Javier Grunmann; Nicolas Maillard; Pedro L. Silva Dias; Robert Walko

Atmospheric models usually demand high processing power and generate large amounts of data. As the degree of parallelism grows, the I/O operations may become the major factor affecting their performance. This work shows that a hybrid MPI/OpenMP implementation can improve the performance of the Ocean-Land-Atmosphere Model (OLAM) in a multicore cluster environment. We show that the hybrid MPI/OpenMP version of OLAM decreases the number of output files, resulting in better performance for I/O operations. We have evaluated OLAM on the parallel file system PVFS and shown that storing the files on PVFS results in lower performance than using the local disks of the cluster nodes, as a consequence of file creation and network concurrency. We have also shown that further parallel optimisations should be included in the hybrid version in order to improve the parallel execution time of OLAM.

WISP | 2011

Improving Performance on Atmospheric Models through a Hybrid OpenMP/MPI Implementation

Carla Osthoff Ferreira de Barros; Pablo Javier Grunmann; Francieli Zanon Boito; Rodrigo Virote Kassick; Laércio Lima Pilla; Philippe Olivier Alexandre Navaux; Claudio Schepke; Jairo Panetta; Nicolas Maillard; Pedro L. Silva Dias; Robert Walko
