Anselmo Antunes Montenegro
Federal Fluminense University
Publication
Featured research published by Anselmo Antunes Montenegro.
computational science and engineering | 2008
Mark Joselli; Marcelo Zamith; Esteban Clua; Anselmo Antunes Montenegro; Aura Conci; Regina Célia P. Leal-Toledo; Luis Valente; Bruno Feijó; Marcos Cordeiro d'Ornellas; Cesar Tadeu Pozzer
The increasing computational power of programmable GPUs (graphics processing units) brings new concepts for using these devices for generic processing. Hence, with the use of the CPU and the GPU for data processing come new ideas that deal with the distribution of tasks between CPU and GPU, such as automatic distribution. The importance of the automatic distribution of tasks between CPU and GPU lies in three facts. First, automatic task distribution enables applications to use the best of both processors. Second, the developer does not have to decide which processor will do the work, allowing the automatic task distribution system to choose the best option at the moment. Third, the application can sometimes be slowed down by other processes if the CPU or GPU is already overloaded. Based on these facts, this paper presents new schemes for efficient automatic task distribution between CPU and GPU. This paper also includes tests and results of implementing those schemes with a test case and with a real-time system.
international conference on computer graphics and interactive techniques | 2008
Mark Joselli; Esteban Clua; Anselmo Antunes Montenegro; Aura Conci; Paulo A. Pagliosa
Graphics processing units, or simply GPUs, have evolved into extremely powerful and flexible processors. This flexibility and power have allowed new concepts in general-purpose computation to emerge. This paper presents a new architecture for physics engines focusing on the simulation of rigid bodies, with some of its methods implemented on the GPU. Sending physics computation to the GPU offloads the required computations from the CPU, allowing it to process other tasks and optimizations. Another important reason for using the GPU is to allow physics engines to process a higher number of bodies in the simulation. The paper also presents an automatic process distribution scheme between CPU and GPU. The importance of automatic distribution for physics simulation arises from the fact that the characteristics of the simulated scene may change during the simulation, and by using an automatic distribution scheme the system may obtain the best performance from both processors (CPU and GPU). Also, with an automatic distribution mode, the developer does not have to decide which processor will do the work, allowing the system to choose between CPU and GPU. This paper also presents an uncoupled multithread game loop used by the physics engine.
conference on computability in europe | 2009
Mark Joselli; Marcelo Zamith; Esteban Clua; Anselmo Antunes Montenegro; Regina Célia P. Leal-Toledo; Aura Conci; Paulo A. Pagliosa; Luis Valente; Bruno Feijó
This article presents a new architecture to implement all game loop models for games and real-time applications that use the GPU as a mathematics and physics coprocessor, working in parallel with the CPU. The presented model applies automatic task distribution concepts. The architecture can apply a set of heuristics defined in Lua scripts in order to determine the best processor for handling a given task. The model applies the GPGPU (general-purpose computation on GPUs) paradigm. In this article we propose an architecture that acquires knowledge about the hardware by running tasks on each processor and, by studying their performance over time, finds the best processor for a group of tasks.
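The idea of learning the best processor by running tasks and observing their timings can be sketched roughly as follows. This is only an illustration in Python, not the paper's Lua-based heuristics: `ProcessorSelector`, its methods, and the backend callables are hypothetical names, and the "CPU" and "GPU" backends are assumed to be plain functions supplied by the caller.

```python
import time
from collections import defaultdict


class ProcessorSelector:
    """Hypothetical helper: pick the faster backend per task group from measured run times."""

    def __init__(self, backends, clock=time.perf_counter):
        self.backends = backends          # name -> callable(task) that executes the task
        self.clock = clock                # injectable so the sketch can be tested deterministically
        self.samples = defaultdict(list)  # (group, backend name) -> elapsed seconds per run

    def profile(self, group, task, runs=3):
        """Run the task on every backend and record how long each run took."""
        for name, run in self.backends.items():
            for _ in range(runs):
                start = self.clock()
                run(task)
                self.samples[(group, name)].append(self.clock() - start)

    def best(self, group):
        """Return the backend name with the lowest mean recorded time for this group."""
        def mean_time(name):
            times = self.samples.get((group, name))
            if not times:
                raise ValueError(f"no timings recorded for {name!r} in group {group!r}")
            return sum(times) / len(times)

        return min(self.backends, key=mean_time)
```

A caller would invoke `profile` periodically, so that the choice can follow changes in load, and dispatch each task group to whatever `best` returns.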
2009 VIII Brazilian Symposium on Games and Digital Entertainment | 2009
Mark Joselli; Erick Baptista Passos; Marcelo Zamith; Esteban Clua; Anselmo Antunes Montenegro; Bruno Feijó
Simulation and visualization of emergent crowds in real time is a computationally intensive task. This intensity mostly comes from the O(n^2) complexity of the traversal algorithm, necessary for the proximity queries of all pairs of entities in order to compute the relevant mutual interactions. Previous works reduced this complexity by considerable factors, using adequate data structures for spatial subdivision and parallel computing on modern graphics hardware, achieving interactive frame rates in real-time simulations. However, the performance of existing proposals is heavily affected by the maximum density of the spatial subdivision cells, which is usually high, leading to algorithms that are not optimal. In this paper we extend a previous neighborhood data structure, called the neighborhood grid, and a simulation architecture that provides extremely low parallel complexity. Also, we implement a representative flocking boids case study, from which we run benchmarks with simulation and rendering of up to 1 million boids at interactive frame rates. We remark that this work can achieve a minimum speedup of 2.94 compared to traditional spatial subdivision methods, with a similar visual experience and lower memory use.
conference on computability in europe | 2008
Marcelo Zamith; Esteban Clua; Aura Conci; Anselmo Antunes Montenegro; Regina Célia P. Leal-Toledo; Paulo A. Pagliosa; Luis Valente; Bruno Feijó
conference on computability in europe | 2009
Erick Baptista Passos; Mark Joselli; Marcelo Zamith; Esteban Clua; Anselmo Antunes Montenegro; Aura Conci; Bruno Feijó
conference on computability in europe | 2009
Erick Baptista Passos; Jonhnny Weslley S. Sousa; Esteban Clua; Anselmo Antunes Montenegro; Leonardo Murta
This article concerns the use of a graphics processing unit (GPU) as a math co-processor in real-time applications, in particular games and physics simulations. To validate this approach, we present a new game loop architecture that employs GPUs for general-purpose computation (GPGPU). A critical issue here is the process distribution between the CPU and the GPU. The architecture consists of a model for distribution, and our implementation offers many advantages in comparison to other approaches without the GPGPU stage. This architecture can be used either with a general-purpose language such as the Compute Unified Device Architecture (CUDA), or with shader languages such as the High-Level Shader Language (HLSL) and the OpenGL Shading Language (GLSL). Although the architecture proposed here aims at supporting mathematics and physics on the GPU, it can be adapted to any kind of generic computation. This article discusses the model implementation in an open-source game engine and presents the results of using this platform.
2010 Brazilian Symposium on Games and Digital Entertainment | 2010
Jose Ricardo Silva Junior; Esteban Clua; Anselmo Antunes Montenegro; Paulo A. Pagliosa
Computing and presenting emergent crowd simulations in real time is a computationally intensive task. This intensity is mostly due to the complexity of the traversal algorithm needed for the interactions of all elements against each other on the basis of a proximity query. By using special data structures such as grids, and due to the parallel nature of graphics hardware, relevant previous work reduced this complexity considerably, making it possible to achieve interactive frame rates. However, existing proposals tend to be heavily bound by the maximum density of such grids, which is usually high, leading to arguably inefficient algorithms. In this article we propose the use of a fine-grained grid and accompanying data manipulation, leading to scalable algorithmic complexity. We also implement a representative flocking boids case study, from which we ran benchmarks with more than one million simulated and rendered boids at nearly 30 fps. We remark that related previous work achieved no more than 15,000 boids at interactive frame rates.
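To illustrate why grids reduce the cost of proximity queries, the sketch below compares an all-pairs test with a conventional uniform grid in 2D. Note that this is the ordinary coarse grid that the abstract describes as prior work, not the fine-grained grid the authors propose; the function names and the choice of cell side equal to the query radius are assumptions made for the example.

```python
from collections import defaultdict
from itertools import product


def brute_force_neighbors(points, radius):
    """Reference all-pairs test: every point is compared with every other point."""
    r2 = radius * radius
    pairs = set()
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            if dx * dx + dy * dy <= r2:
                pairs.add((i, j))
    return pairs


def grid_neighbors(points, radius):
    """Bucket points into square cells of side `radius`, then test only the
    3x3 block of cells around each point instead of the whole set."""
    r2 = radius * radius
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        cells[(int(x // radius), int(y // radius))].append(idx)

    pairs = set()
    for (cx, cy), members in cells.items():
        for ox, oy in product((-1, 0, 1), repeat=2):
            candidates = cells.get((cx + ox, cy + oy), ())
            for i in members:
                for j in candidates:
                    if i < j:  # report each pair once, with the smaller index first
                        dx = points[i][0] - points[j][0]
                        dy = points[i][1] - points[j][1]
                        if dx * dx + dy * dy <= r2:
                            pairs.add((i, j))
    return pairs
```

The cost of the grid version depends on how many points share a cell, which is the density dependence the abstract points to as the limitation of this kind of structure.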
international conference on entertainment computing | 2012
Thales Luis Sabino; Paulo Andrade; Esteban Clua; Anselmo Antunes Montenegro; Paulo A. Pagliosa
Most game engines are based on inheritance of game objects and/or componentization of behaviors. While this approach enables clear visualization of the system architecture, good code reuse, and fast prototyping, it brings some issues, mostly related to the high dependency between game object and component instances. This dependency often leads to static casts and null pointer references that are difficult to debug. In this article we propose the use of the dependency injection design pattern to safely initialize game objects and lessen the programmer's role in handling these issues during both the prototyping and production phases. Since these dependencies are attributes of game objects and the injection occurs only at the initialization pass, there is no performance penalty in the game loop.
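A minimal sketch of the pattern, assuming a component-based design in which each component declares the component types it needs. All class names, the `requires` attribute, and the `inject` function are hypothetical and are not taken from the article or from any particular engine.

```python
class Transform:
    """Example component holding a 2D position."""

    def __init__(self):
        self.position = (0.0, 0.0)


class Renderer:
    """Example component that depends on a Transform."""

    # Attribute name -> required component type; filled in by inject().
    requires = {"transform": Transform}

    def describe(self):
        return f"drawing at {self.transform.position}"


class GameObject:
    """Container that indexes its components by type."""

    def __init__(self, *components):
        self.components = {type(c): c for c in components}


def inject(game_object):
    """Resolve every declared dependency once, at initialization time.

    A missing dependency fails here with a clear error instead of surfacing
    later as a null reference inside the game loop.
    """
    for component in game_object.components.values():
        for attr, dep_type in getattr(component, "requires", {}).items():
            if dep_type not in game_object.components:
                raise LookupError(
                    f"{type(component).__name__} needs {dep_type.__name__}"
                )
            setattr(component, attr, game_object.components[dep_type])
```

Because the lookup happens only in `inject`, code running in the game loop reads `self.transform` as an ordinary attribute, which matches the abstract's point that injection adds no per-frame cost.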
Computer-Aided Engineering | 2011
Lucas Lattari; Anselmo Antunes Montenegro; Aura Conci; Esteban Clua; Virgínia Fernandes Mota; Marcelo Bernardes Vieira; Gabriel Lizarraga
Simulation of natural phenomena, such as water and smoke, is a very important topic for increasing real-time scene realism in video games. Besides the graphical aspect, in order to achieve realism, it is necessary to correctly simulate and solve the complex governing equations, which requires intense computational work. Fluid simulation is achieved by solving the Navier-Stokes set of equations, using a numerical method on the CPU or GPU, independently, as these equations do not have an analytical solution. The real-time simulation also requires simulating the interaction of the particles with objects in the scene, which involves many collision and contact force calculations and may drastically increase the computational time. In this paper we propose a heterogeneous multicore CPU and GPU hybrid architecture for fluid simulation with two-way interaction between them, and with fine-grained control over rigid body shape collision. We also show the impact of this heterogeneous architecture over GPU-bound and CPU-bound simulations, which are commonly used for this kind of application. The heterogeneous architecture developed in this work is designed to best fit the Single Instruction Multiple Thread (SIMT) model used by GPUs in all simulation stages, allowing a significant performance increase.
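For reference, the Navier-Stokes equations mentioned above are commonly written, for an incompressible Newtonian fluid, in the form below. The abstract does not state which formulation the paper uses, so the incompressibility assumption and the symbols (velocity u, pressure p, density ρ, kinematic viscosity ν, body force per unit mass f) are assumptions made here for illustration.

```latex
\begin{align}
  \frac{\partial \mathbf{u}}{\partial t}
    + (\mathbf{u} \cdot \nabla)\,\mathbf{u}
    &= -\frac{1}{\rho}\,\nabla p
       + \nu\,\nabla^{2}\mathbf{u}
       + \mathbf{f}, \\
  \nabla \cdot \mathbf{u} &= 0 .
\end{align}
```

The first equation expresses conservation of momentum and the second conservation of mass.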