Anselmo Antunes Montenegro

Publication

Featured research published by Anselmo Antunes Montenegro.

computational science and engineering | 2008

Automatic Dynamic Task Distribution between CPU and GPU for Real-Time Systems

Mark Joselli; Marcelo Zamith; Esteban Clua; Anselmo Antunes Montenegro; Aura Conci; Regina Célia P. Leal-Toledo; Luis Valente; Bruno Feijó; Marcos Cordeiro d'Ornellas; Cesar Tadeu Pozzer

The increasing computational power of programmable GPUs (graphics processing units) brings new concepts for using these devices for generic processing. Using both the CPU and the GPU for data processing in turn brings new ideas that deal with the distribution of tasks between CPU and GPU, such as automatic distribution. The importance of the automatic distribution of tasks between CPU and GPU lies in three facts. First, automatic task distribution enables applications to use the best of both processors. Second, the developer does not have to decide which processor will do the work, allowing the automatic task distribution system to choose the best option at that moment. Third, the application can sometimes be slowed down by other processes if the CPU or GPU is already overloaded. Based on these facts, this paper presents new schemes for efficient automatic task distribution between CPU and GPU. This paper also includes tests and results of implementing those schemes with a test case and with a real-time system.

international conference on computer graphics and interactive techniques | 2008

A new physics engine with automatic process distribution between CPU-GPU

Mark Joselli; Esteban Clua; Anselmo Antunes Montenegro; Aura Conci; Paulo A. Pagliosa

Graphics processing units (GPUs) have evolved into extremely powerful and flexible processors. This flexibility and power have allowed new concepts in general-purpose computation to emerge. This paper presents a new architecture for physics engines, focusing on the simulation of rigid bodies, with some of its methods implemented on the GPU. Sending physics computation to the GPU offloads the required computations from the CPU, allowing it to process other tasks and optimizations. Another important reason for using the GPU is to allow physics engines to process a higher number of bodies in the simulation. The paper also presents an automatic process distribution scheme between CPU and GPU. The importance of automatic distribution for physics simulation arises from the fact that the characteristics of the simulated scene may change during the simulation; by using an automatic distribution scheme, the system may obtain the best performance of both processors (CPU and GPU). Also, with an automatic distribution mode, the developer does not have to decide which processor will do the work, allowing the system to choose between CPU and GPU. This paper also presents an uncoupled multithread game loop used by the physics engine.

conference on computability in europe | 2009

An adaptative game loop architecture with automatic distribution of tasks between CPU and GPU

Mark Joselli; Marcelo Zamith; Esteban Clua; Anselmo Antunes Montenegro; Regina Célia P. Leal-Toledo; Aura Conci; Paulo A. Pagliosa; Luis Valente; Bruno Feijó

This article presents a new architecture to implement all game loop models for games and real-time applications that use the GPU as a mathematics and physics coprocessor, working in parallel with the CPU. The presented model applies automatic task distribution concepts. The architecture can apply a set of heuristics defined in Lua scripts in order to determine the best processor for handling a given task. The model applies the GPGPU (general-purpose computation on GPUs) paradigm. In this article we propose an architecture that acquires knowledge about the hardware by running tasks on each processor and, by studying their performance over time, finds the best processor for a group of tasks.

2009 VIII Brazilian Symposium on Games and Digital Entertainment | 2009

A Neighborhood Grid Data Structure for Massive 3D Crowd Simulation on GPU

Mark Joselli; Erick Baptista Passos; Marcelo Zamith; Esteban Clua; Anselmo Antunes Montenegro; Bruno Feijó

Simulation and visualization of emergent crowds in real time is a computationally intensive task. This intensity mostly comes from the O(n^2) complexity of the traversal algorithm, necessary for the proximity queries of all pairs of entities in order to compute the relevant mutual interactions. Previous works reduced this complexity by considerable factors, using adequate data structures for spatial subdivision and parallel computing on modern graphics hardware, achieving interactive frame rates in real-time simulations. However, the performance of existing proposals is heavily affected by the maximum density of the spatial subdivision cells, which is usually high, leading to algorithms that are not optimal. In this paper we extend a previous neighborhood data structure, called the neighborhood grid, and a simulation architecture that provides extremely low parallel complexity. Also, we implement a representative flocking boids case study, from which we run benchmarks with simulation and rendering of up to 1 million boids at interactive frame rates. We remark that this work can achieve a minimum speedup of 2.94 when compared to traditional spatial subdivision methods, with a similar visual experience and with lower memory use.

conference on computability in europe | 2008

A game loop architecture for the GPU used as a math coprocessor in real-time applications

Marcelo Zamith; Esteban Clua; Aura Conci; Anselmo Antunes Montenegro; Regina Célia P. Leal-Toledo; Paulo A. Pagliosa; Luis Valente; Bruno Feijó

This article concerns the use of a graphics processing unit (GPU) as a math coprocessor in real-time applications, especially games and physics simulations. To validate this approach, we present a new game loop architecture that employs GPUs for general-purpose computation (GPGPU). A critical issue here is the process distribution between the CPU and the GPU. The architecture consists of a model for distribution, and our implementation offers many advantages in comparison to other approaches without the GPGPU stage. This architecture can be used either by a general-purpose language such as the Compute Unified Device Architecture (CUDA), or by shader languages such as the High-Level Shader Language (HLSL) and the OpenGL Shading Language (GLSL). Although the architecture proposed here aims at supporting mathematics and physics on the GPU, it is possible to adapt any kind of generic computation. This article discusses the model implementation in an open-source game engine and presents the results of using this platform.

conference on computability in europe | 2009

A bidimensional data structure and spatial optimization for supermassive crowd simulation on GPU

Erick Baptista Passos; Mark Joselli; Marcelo Zamith; Esteban Clua; Anselmo Antunes Montenegro; Aura Conci; Bruno Feijó

Computing and presenting emergent crowd simulations in real time is a computationally intensive task. This intensity is mostly due to the complexity of the traversal algorithm needed for the interactions of all elements against each other on the basis of a proximity query. By using special data structures such as grids, and due to the parallel nature of graphics hardware, relevant previous work reduced this complexity considerably, making it possible to achieve interactive frame rates. However, existing proposals tend to be heavily bound by the maximum density of such grids, which is usually high, leading to arguably inefficient algorithms. In this article we propose the use of a fine-grained grid and accompanying data manipulation, leading to scalable algorithmic complexity. We also implement a representative flocking boids case study, from which we ran benchmarks with more than one million simulated and rendered boids at nearly 30 fps. We remark that related previous work achieved no more than 15,000 boids with interactive frame rates.

conference on computability in europe | 2009

Smart composition of game objects using dependency injection

Erick Baptista Passos; Jonhnny Weslley S. Sousa; Esteban Clua; Anselmo Antunes Montenegro; Leonardo Murta

Most game engines are based on inheritance of game objects and/or componentization of behaviors. While this approach enables clear visualization of the system architecture, good code reuse, and fast prototyping, it brings some issues, mostly related to the high dependency between game object/component instances. This dependency often leads to static casts and null pointer references that are difficult to debug. In this article we propose the use of the dependency injection design pattern to safely initialize game objects and lessen the programmer's role in handling these issues during both the prototyping and production phases. Since these dependencies are attributes of game objects and the injection occurs only at the initialization pass, there is no performance penalty in the game loop.

2010 Brazilian Symposium on Games and Digital Entertainment | 2010

Fluid Simulation with Two-Way Interaction Rigid Body Using a Heterogeneous GPU and CPU Environment

Jose Ricardo Silva Junior; Esteban Clua; Anselmo Antunes Montenegro; Paulo A. Pagliosa

Simulation of natural phenomena, such as water and smoke, is a very important topic for increasing real-time scene realism in video games. Besides the graphical aspect, achieving realism requires correctly simulating and solving the complex governing equations, which demands intense computational work. Fluid simulation is achieved by solving the Navier-Stokes set of equations, using a numerical method on the CPU or the GPU, independently, as these equations do not have an analytical solution. Real-time simulation also requires simulating the interaction of the particles with objects in the scene, which involves many collision and contact force calculations and may drastically increase the computational time. In this paper we propose a heterogeneous multicore CPU and GPU hybrid architecture for fluid simulation with two-way interaction between them, and with fine-granularity control over rigid body shape collision. We also show the impact of this heterogeneous architecture compared with GPU-bound and CPU-bound simulations, which are commonly used for this kind of application. The heterogeneous architecture developed in this work is designed to best fit the Single Instruction Multiple Thread (SIMT) model used by GPUs in all simulation stages, allowing a high performance increase.

international conference on entertainment computing | 2012

A hybrid GPU rasterized and ray traced rendering pipeline for real time rendering of per pixel effects

Thales Luis Sabino; Paulo Andrade; Esteban Clua; Anselmo Antunes Montenegro; Paulo A. Pagliosa

Computer-Aided Engineering | 2011

Using graph cuts in GPUs for color based human skin segmentation

Lucas Lattari; Anselmo Antunes Montenegro; Aura Conci; Esteban Clua; Virgínia Fernandes Mota; Marcelo Bernardes Vieira; Gabriel Lizarraga
