Carlos Augusto Paiva da Silva Martins

Publication

Featured research published by Carlos Augusto Paiva da Silva Martins.

frontiers in education conference | 2006

MSCSim - Multilevel and Split Cache Simulator

Luiza M. N. Coutinho; José Leandro D. Mendes; Carlos Augusto Paiva da Silva Martins

Learning the various structures and levels of the memory hierarchy through conventional procedures is difficult. A didactic cache memory simulator (MSCSim) was proposed and developed as part of our undergraduate research, to serve both as an auxiliary teaching tool for the Computer Architecture professor and as a learning tool for students in undergraduate or graduate courses. MSCSim is a simulation platform with multilevel cache, split cache and memory hierarchy (cache, main and virtual memory) features, intended for teaching, learning and research. The simulator covers the various levels of the memory hierarchy and can be used by students in their studies or by senior researchers. In conclusion, MSCSim lets the user simulate, experiment with and learn the various concepts and techniques of the memory hierarchy. By using it, students can improve their knowledge through practical exercises and experiments.

international symposium on circuits and systems | 2006

Reconfigurable crossbar switch architecture for network processors

Henrique C. Freitas; Milene Barbosa Carvalho; Alexandre Marques Amaral; Amanda Rafaela Diniz; Carlos Augusto Paiva da Silva Martins; Luiz E. Ramos

This paper presents the proposal and development of a reconfigurable crossbar switch (RCS) architecture for network processors. Its main purpose is to increase performance and flexibility in environments with multiprocessors and computer clusters. The results include a VHDL simulation of the RCS and its use in implementing a broadcast function found in message-passing support middleware.

job scheduling strategies for parallel processing | 2004

Reconfigurable gang scheduling algorithm

Luís Fabrício Wanderley Góes; Carlos Augusto Paiva da Silva Martins

A single traditional gang scheduling algorithm cannot provide the best performance for all workloads and parallel architectures. A solution to this problem is an algorithm capable of dynamically changing its form (configuration) into a more appropriate one, according to environment variations and user requirements. In this paper, we propose, implement and analyze the performance of a Reconfigurable Gang Scheduling Algorithm (RGSA) using simulation. An RGSA uses combinations of independent features that are often implemented in GSAs, such as packing and re-packing schemes (alternative scheduling etc.), multiprogramming levels etc. Ideally, the algorithm may assume infinite configurations, and it reconfigures itself according to input parameters such as performance metrics (mean utilization, mean response time of jobs etc.) and workload characteristics (mean execution time of jobs, mean parallelism degree of jobs etc.). Also ideally, a reconfiguration causes the algorithm to output the best configuration for a particular situation, considering the system's state at a given moment. The main contributions of this paper are the definition, proposal, implementation and performance analysis of RGSA.
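
To make the packing schemes and multiprogramming levels mentioned in this abstract concrete, the sketch below packs jobs into a gang scheduling matrix (rows are time slices, columns are processors) using a first-fit rule. It is only an illustration under stated assumptions: the abstract does not describe RGSA's data structures, and the function name first_fit_pack, its parameters and the sample jobs are all hypothetical.

```python
def first_fit_pack(jobs, num_procs, max_slots):
    """Pack jobs into a gang scheduling matrix with a first-fit rule.

    jobs      : list of (name, processors_needed) tuples (hypothetical input)
    num_procs : number of processors (matrix columns)
    max_slots : multiprogramming level (maximum number of rows / time slices)

    Returns (matrix, placed), where matrix is a list of rows and placed maps
    each job name to its row index, or to None if the job has to wait.
    """
    matrix = []
    placed = {}
    for name, need in jobs:
        if need > num_procs:
            raise ValueError(f"job {name} needs more processors than exist")
        for row_idx, row in enumerate(matrix):
            free = [col for col, owner in enumerate(row) if owner is None]
            if len(free) >= need:
                # All tasks of the job share one time slice (gang property).
                for col in free[:need]:
                    row[col] = name
                placed[name] = row_idx
                break
        else:
            # No existing row had room: open a new time slice if allowed.
            if len(matrix) < max_slots:
                matrix.append([name] * need + [None] * (num_procs - need))
                placed[name] = len(matrix) - 1
            else:
                placed[name] = None  # multiprogramming level reached
    return matrix, placed


# Illustrative workload, not taken from the paper.
matrix, placed = first_fit_pack(
    [("A", 3), ("B", 2), ("C", 1), ("D", 4)], num_procs=4, max_slots=3
)
for row in matrix:
    print(row)
```

A reconfigurable scheduler in the sense of the abstract could, for example, swap this first-fit rule for another packing scheme or change max_slots at run time; how RGSA actually selects a configuration is not specified here.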

ieee conference on electromagnetic field computation | 2010

3D parallel conjugate gradient solver optimized for GPUs

Rogerio F. Carvalho; Carlos Augusto Paiva da Silva Martins; Rose M. S. Batalha; Ana F. P. Camargos

Conjugate gradient (CG) solver implementations optimized for GPUs are developed and applied to a 3D finite element method (FEM) problem. Results show that our GPU implementations achieve better computational performance than cluster and other GPU implementations, owing to algorithmic and other architecture-dependent optimizations.
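
For readers unfamiliar with the method, the following is a minimal serial sketch of the textbook conjugate gradient iteration for a symmetric positive definite system A x = b. It is not the authors' GPU implementation and contains none of their optimizations; the dense list-of-lists matrix, the helper names and the 2x2 test system are assumptions chosen for brevity.

```python
def dot(u, v):
    """Inner product of two vectors stored as Python lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(A, v):
    """Dense matrix-vector product; A is a list of rows."""
    return [dot(row, v) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A (textbook CG)."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with the initial guess x = 0
    p = list(r)          # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)            # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old                 # keeps directions A-conjugate
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Small illustrative system (not from the paper); exact solution is [1/11, 7/11].
x = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
print(x)
```

In a parallel or GPU version, the matrix-vector product and the inner products are the operations that are typically distributed; the sparse storage format and kernel design used by the authors are not described in the abstract.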

frontiers in education conference | 2002

A new learning method of microprocessor architecture

Carlos Augusto Paiva da Silva Martins; João Batista T. Corrêa; Luís Fabrício Wanderley Góes; Luiz E. Ramos; Talles Henrique Medeiros

We present a new learning method of microprocessor architecture based on design and verification using functional simulation. Our main goals are to improve and optimize the learning process, motivating students to study and learn theoretical and practical aspects of microprocessor architecture, using functional simulators to validate the microprocessor design and to construct knowledge; and develop research activities during an undergraduate course. Our method is based on learning, constructivism theory, problem based learning, group projects, design of academic microprocessors as motivation for theory study/learning and verification of designed microprocessors through functional simulators developed by students. To validate the proposed method we analyze two microprocessors and functional simulators: a digital signal processor using ASIP and RISC concepts, and a RISC ASIP home automation processor. They were developed in a computer architecture course (computer science, PUC-Minas, Brazil) as the application of this method. In the conclusion students and professor analyze the results, highlighting the main differences, advantages and disadvantages of the new method.

Frontiers in Education | 2003

DCMSIM: didactic cache memory simulator

Eduardo S. Cordeiro; Italo G. A. Stefani; Tays C. A. P. Soares; Carlos Augusto Paiva da Silva Martins

We present a functional and structural didactic simulator of cache memory systems developed at the Pontifical Catholic University of Minas Gerais, Brazil. The development occurred during the undergraduate Computer Architecture discipline, in the Computer Science course. Its implementation is one part of a new didactic method, in which developers (students of the Computer Architecture discipline) must learn the concepts and theory of the discipline topics to correctly apply them in the simulator. In our simulator, DCMSim, there are features to allow students to construct and verify knowledge, testing and comparing several different configurations and memory access traces.
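
The idea of comparing cache configurations on a memory access trace can be illustrated with a very small direct-mapped cache model. This is a sketch under stated assumptions, not DCMSim's code: the class name, its parameters and the sample trace are hypothetical, and DCMSim's actual features are not detailed in the abstract.

```python
class DirectMappedCache:
    """Minimal direct-mapped cache model that only counts hits and misses."""

    def __init__(self, num_lines, block_size):
        self.num_lines = num_lines      # number of cache lines
        self.block_size = block_size    # bytes per block
        self.tags = [None] * num_lines  # tag stored in each line
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Simulate one memory access; return True on a hit."""
        block = address // self.block_size
        index = block % self.num_lines   # line the block maps to
        tag = block // self.num_lines    # identifies which block is resident
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag           # replace whatever was in the line
        self.misses += 1
        return False


def run_trace(num_lines, block_size, trace):
    """Run a list of byte addresses through a fresh cache."""
    cache = DirectMappedCache(num_lines, block_size)
    for address in trace:
        cache.access(address)
    return cache.hits, cache.misses


# Illustrative trace of byte addresses (not taken from the paper).
trace = [0, 4, 16, 0, 64, 0]
print(run_trace(4, 16, trace))  # (2, 4): address 64 evicts block 0
print(run_trace(8, 16, trace))  # (3, 3): more lines remove that conflict
```

Running the same trace through both configurations shows how a conflict miss disappears when the number of lines grows, which is the kind of comparison the abstract describes.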

Archive | 2005

Extending ClusterSim with MP and DSM Modules

Christiane V. Pousa; Luiz E. Ramos; Luís Fabrício Wanderley Góes; Carlos Augusto Paiva da Silva Martins

In this paper, we present a new version of ClusterSim (Cluster Simulation Tool), in which we included two new modules: Message-Passing (MP) and Distributed Shared Memory (DSM). ClusterSim supports the visual modeling and the simulation of clusters and their workloads for performance analysis. A modeled cluster is composed of single or multi-processed nodes, parallel job schedulers, network topologies, message-passing communications, distributed shared memory and technologies. A modeled workload is represented by users that submit jobs composed of tasks described by probability distributions and their internal structure (CPU, I/O, DSM and MPI instructions). Our main objectives in this paper are: to present a new version of ClusterSim with the inclusion of Message-Passing and Distributed Shared Memory simulation modules; to present the new software architecture and simulation model; to verify the proposal and implementation of MPI collective communication functions using different communication patterns (Message-Passing Module); to verify the proposal and implementation of DSM operations, consistency models and coherence protocols for object sharing (Distributed Shared Memory Module); to analyze ClusterSim v. 1.1 by means of two case studies. Our main contributions are the inclusion of the Message-Passing and Distributed Shared Memory simulation modules, a more detailed simulation model of ClusterSim and new features in the graphical environment.

international symposium on parallel and distributed processing and applications | 2005

A proposal of reconfigurable MPI collective communication functions

Luiz E. Ramos; Carlos Augusto Paiva da Silva Martins

Message Passing Interface (MPI) Collective Communication Functions (MCCF) are usually implemented in programming libraries using invariable algorithms. Such algorithms do not always yield the best performance for all kinds of applications and execution environments. In this paper, we propose and present a set of optimized reconfigurable MCCF with variable structures and behaviors, which add flexibility and high performance to collective communications. We simulate, analytically model, verify and analyze the proposed functions, and compare them with invariable implementations. Our results show that reconfiguration at the algorithm level yields flexibility and performance gains in MCCF.
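
As an illustration of why the choice of algorithm matters for a collective operation, the sketch below builds the communication schedule of a broadcast from rank 0 in two well-known ways: a linear scheme, where the root sends to every other rank in turn, and a binomial tree, where every rank that already holds the data forwards it. The abstract does not list the algorithms the authors use, so these two schemes and the function names are assumptions made for the example.

```python
def linear_broadcast(num_ranks):
    """Root (rank 0) sends to each other rank, one message per round."""
    return [[(0, dst)] for dst in range(1, num_ranks)]


def binomial_broadcast(num_ranks):
    """Each round, every rank holding the data sends to one new rank."""
    rounds = []
    have = 1  # ranks 0 .. have-1 already hold the data
    while have < num_ranks:
        rounds.append(
            [(src, src + have) for src in range(have) if src + have < num_ranks]
        )
        have *= 2
    return rounds


for schedule in (linear_broadcast(8), binomial_broadcast(8)):
    print(len(schedule), "rounds:", schedule)
```

For 8 ranks the linear schedule takes 7 rounds and the binomial tree takes 3. Which scheme is faster in practice depends on message size, network topology and contention, which is the kind of variation a reconfigurable function could respond to.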

Frontiers in Education | 2003

RJSSIM: A reconfigurable job scheduling simulator for parallel processing learning

L. F. W. Góes; Carlos Augusto Paiva da Silva Martins

In this work, we present and analyze the use of a reconfigurable job scheduling simulator called RJSSim as an aid for parallel processing learning. This software is a Java-based functional and performance simulator of job scheduling policies. Our objectives are to present RJSSim and to show the use of RJSSim and reconfigurability concepts for parallel processing learning. First, we present a prototype of RJSSim and the reconfigurability concept. Then we describe two case studies, highlighting the main parallel processing concepts and skills students can learn. In the first case study, we analyze two parallel algorithm models. In the second, we run performance tests comparing parallel (reconfigurable) scheduling policies and architectures. Then, we analyze the performance through metrics such as response time and idle time. Finally, we discuss and analyze the use of RJSSim for parallel processing learning. Our main contribution is an implementation of RJSSim for parallel processing learning.

ieee conference on electromagnetic field computation | 2009

Superlinear Speedup in a 3-D Parallel Conjugate Gradient Solver

A. F. P. Camargos; Rose M. S. Batalha; Carlos Augusto Paiva da Silva Martins; Elson J. Silva; Gustavo Luís Soares

This paper presents a computational performance analysis of a parallel implementation of a conjugate gradient (CG) solver using domain decomposition and distributed memory computers, applied to a 3-D finite element method problem. The results show a superlinear speedup, which is not usually expected. The analysis shows why and how it can happen.
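
The term superlinear speedup can be made precise with the usual definitions: speedup is the serial run time divided by the parallel run time, and it is superlinear when it exceeds the number of processors. The helper names and the timings below are hypothetical and are not results from the paper.

```python
def speedup(t_serial, t_parallel):
    """Speedup S = T_serial / T_parallel."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, num_procs):
    """Parallel efficiency E = S / p; values above 1 mean superlinear."""
    return speedup(t_serial, t_parallel) / num_procs


def is_superlinear(t_serial, t_parallel, num_procs):
    """True when the speedup exceeds the processor count."""
    return speedup(t_serial, t_parallel) > num_procs


# Hypothetical timings in seconds, for illustration only.
print(speedup(120.0, 12.0))            # 10.0
print(efficiency(120.0, 12.0, 8))      # 1.25
print(is_superlinear(120.0, 12.0, 8))  # True
```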
