Dorgival O. Guedes
Universidade Federal de Minas Gerais
Publications
Featured research published by Dorgival O. Guedes.
international symposium on computer architecture | 2007
Bruno Diniz; Dorgival O. Guedes; Wagner Meira; Ricardo Bianchini
The peak power consumption of hardware components affects their power supply, packaging, and cooling requirements. When the peak power consumption is high, the hardware components or the systems that use them can become expensive and bulky. Given that components and systems rarely (if ever) actually require peak power, it is highly desirable to limit power consumption to a less-than-peak power budget, based on which power supply, packaging, and cooling infrastructures can be more intelligently provisioned. In this paper, we study dynamic approaches for limiting the power consumption of main memories. Specifically, we propose four techniques that limit consumption by adjusting the power states of the memory devices, as a function of the load on the memory subsystem. Our simulations of applications from three benchmarks demonstrate that our techniques can consistently limit power to a pre-established budget. Two of the techniques can limit power with very low performance degradation. Our results also show that, when using these superior techniques, limiting power is at least as effective an energy-conservation approach as state-of-the-art techniques explicitly designed for performance-aware energy conservation. These latter results represent a departure from current energy management research and practice.
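The budget-driven idea of adjusting memory power states as a function of load can be sketched as follows. This is only an illustration of the general approach, not any of the paper's four techniques; the state names, power values, and the least-loaded-first demotion policy are all assumptions made for the example.

```python
# Illustrative power states with made-up power draws (mW).
STATE_POWER = {"active": 300, "standby": 180, "nap": 30}
DEMOTE = {"active": "standby", "standby": "nap"}  # one-step demotions

def limit_power(loads, budget_mw):
    """loads: {device: access rate}. Returns {device: state} chosen so the
    total draw fits the budget when possible, demoting the least-loaded
    devices first so hot devices keep their performance."""
    states = {dev: "active" for dev in loads}

    def total():
        return sum(STATE_POWER[s] for s in states.values())

    for dev in sorted(loads, key=loads.get):  # least-loaded first
        while total() > budget_mw and states[dev] in DEMOTE:
            states[dev] = DEMOTE[states[dev]]
    return states
```

With three devices drawing 900 mW at full power and a 600 mW budget, the controller naps the idlest device and puts the next one in standby, leaving the busiest device active.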
european conference on machine learning | 2011
Hélio Marcos Paz de Almeida; Dorgival O. Guedes; Wagner Meira; Mohammed Javeed Zaki
Graph clustering, the process of discovering groups of similar vertices in a graph, is a very interesting area of study, with applications in many different scenarios. One of the most important aspects of graph clustering is the evaluation of cluster quality, which is important not only to measure the effectiveness of clustering algorithms, but also to give insights on the dynamics of relationships in a given network. Many quality evaluation metrics for graph clustering have been proposed in the literature, but there is no consensus on how they compare to each other and how well they perform on different kinds of graphs. In this work we study five major graph clustering quality metrics in terms of their formal biases and their behavior when applied to clusters found by four implementations of classic graph clustering algorithms on five large, real-world graphs. Our results show that those popular quality metrics have strong biases toward incorrectly awarding good scores to some kinds of clusters, especially in larger networks. They also indicate that currently used clustering algorithms and quality metrics do not behave as expected when cluster structures differ from the more traditional, clique-like ones.
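One classic example of the kind of quality metric such studies examine is Newman's modularity, which compares the fraction of intra-cluster edges against what a random degree-preserving graph would produce. A minimal pure-Python version (the paper evaluates five metrics; this is just one well-known instance for illustration):

```python
def modularity(edges, cluster):
    """edges: list of (u, v) pairs of an undirected graph;
    cluster: {vertex: cluster id}. Returns Newman's modularity Q."""
    m = len(edges)
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1

    # Observed fraction of edges that fall inside clusters.
    q = sum(1.0 / m for u, v in edges if cluster[u] == cluster[v])

    # Expected intra-cluster fraction under the random null model:
    # sum over ordered vertex pairs of k_u * k_v / (2m)^2.
    nodes = list(deg)
    for u in nodes:
        for v in nodes:
            if cluster[u] == cluster[v]:
                q -= deg[u] * deg[v] / (4.0 * m * m)
    return q
```

Two disconnected triangles clustered separately score Q = 0.5, while putting every vertex in one cluster scores Q = 0, matching the intuition that modularity rewards dense, well-separated groups.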
symposium on computer architecture and high performance computing | 2005
Renato Ferreira; Wagner Meira; Dorgival O. Guedes; Lúcia Maria de A. Drummond; Bruno Coutinho; George Teodoro; Tulio Tavares; Renata Braga Araújo; Guilherme T. Ferreira
Data mining techniques are becoming increasingly popular as a reasonable means of collecting summaries from the rapidly growing datasets in many areas. However, as the size of the raw data increases, parallel data mining algorithms become a necessity. In this paper, we present a run-time support system designed to allow the efficient implementation of data-mining algorithms on heterogeneous distributed environments. We believe the runtime framework is suitable for a broader class of applications, beyond data mining. We also present a parallelization strategy that is supported by the run-time system. We show scalability results for three different data-mining algorithms that were parallelized using our approach and our run-time support. All applications scale almost linearly up to a large number of nodes.
IEEE Internet Computing | 2006
Dorgival O. Guedes; Wagner Meira; Renato Ferreira
Data mining focuses on extracting useful information from large volumes of data, and thus has been the center of much attention in recent years. Building scalable, extensible, and easy-to-use data mining systems, however, has proved to be difficult. In response, the authors developed Anteater, a service-oriented architecture for data mining that relies on Web services to achieve extensibility and interoperability, offers simple abstractions for users, and supports computationally intensive processing on large amounts of data through massive parallelism.
IEEE Communications Surveys and Tutorials | 2015
Daniel F. Macedo; Dorgival O. Guedes; Luiz Filipe M. Vieira; Marcos Augusto M. Vieira; Michele Nogueira
Current implementations of Internet systems are very hard to upgrade. The ossification of existing standards restricts the development of more advanced communication systems. New research initiatives, such as virtualization, software-defined radios, and software-defined networks, allow more flexibility for networks. Until now, however, those initiatives have been developed individually. We advocate that the convergence of these overlapping and complementary technologies can expand the amount of programmability in the network and support different innovative applications. Hence, this paper surveys the most recent research initiatives on programmable networks. We characterize programmable networks, in which programmable devices execute specific code and the network is separated into three planes: the data, control, and management planes. We discuss modern programmable network architectures, emphasizing their research issues and, when possible, highlighting their practical implementations. We survey the wireless and wired elements of the programmable data plane. Next, on the programmable control plane, we survey the divisor and controller elements. We conclude with final considerations, open issues, and future challenges.
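The data/control plane split at the heart of software-defined networking can be illustrated with a tiny match-action flow table: the control plane installs rules, and the data plane only matches packet headers against them. The class, field names, and default action below are illustrative assumptions, not any real controller or switch API.

```python
class FlowTable:
    """Toy SDN flow table: rules are (match, action) pairs checked in
    installation order, as in a priority-ordered match-action pipeline."""

    def __init__(self):
        self.rules = []

    def install(self, match, action):
        # Control-plane operation: add a forwarding rule.
        self.rules.append((match, action))

    def forward(self, packet):
        # Data-plane operation: first rule whose fields all match wins.
        for match, action in self.rules:
            if all(packet.get(k) == v for k, v in match.items()):
                return action
        return "drop"  # table-miss default, an assumption for this sketch
```

Usage: after `install({"dst": "10.0.0.2"}, "port2")`, packets with that destination are sent to `port2` and everything else hits the table-miss default.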
symposium on computer architecture and high performance computing | 2003
George Teodoro; Tulio Tavares; Bruno Coutinho; Wagner Meira; Dorgival O. Guedes
One of the main challenges to the wide use of the Internet is the scalability of the servers, that is, their ability to handle the increasing demand. Scalability in stateful servers, which comprise e-commerce and other transaction-oriented servers, is even more difficult, since it is necessary to keep transaction data across requests from the same user. One common strategy for achieving scalability is to employ clustered servers, where the load is distributed among the various servers. However, as a consequence of the workload characteristics and the need to keep data coherent among the servers that compose the cluster, load imbalance arises among the servers, reducing the efficiency of the server as a whole. We propose and evaluate a strategy for load balancing in stateful clustered servers. Our strategy is based on control theory and achieved significant gains over configurations that do not employ a load balancing strategy, reducing response time by up to 50% and increasing throughput by up to 16%.
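The control-theoretic idea can be sketched as a simple proportional controller: each server's routing weight is nudged down in proportion to how far its utilization sits above the cluster mean, so overloaded servers receive less new work. The gain value and update rule below are assumptions for illustration, not the paper's actual controller.

```python
def rebalance(weights, utilizations, gain=0.5):
    """One proportional-control step over parallel lists of per-server
    routing weights and measured utilizations (both same length)."""
    mean = sum(utilizations) / len(utilizations)
    # error = utilization - mean; positive error reduces the weight
    new = [max(0.0, w - gain * (u - mean))
           for w, u in zip(weights, utilizations)]
    total = sum(new)
    return [w / total for w in new]  # renormalize to a distribution
```

Starting from equal weights, a server at 90% utilization loses routing weight to one at 10%, and repeated steps drive the utilizations toward each other.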
IEEE Internet Computing | 2006
Bruno Rocha; Virgílio A. F. Almeida; Dorgival O. Guedes
A routing overlay network is an application-layer overlay on the existing Internet routing substrate that provides an alternative routing service. Recent studies have suggested that such networks might contain selfish nodes, which develop their strategies by considering only their own objectives. Extremely selfish nodes, called free-riders, might even refuse to share their resources with the network, thus making the overlay service unavailable to the nodes that depend on them. The authors use a game-theoretic approach to evaluate the selfish-node mechanism and increase quality of service (QoS) by detecting and excluding free-riders.
international conference on parallel processing | 2008
George Teodoro; Daniel Fireman; Dorgival O. Guedes; Wagner Meira; Renato Ferreira
New architectural trends in chip design resulted in machines with multiple processing units as well as efficient communication networks, leading to the wide availability of systems that provide multiple levels of parallelism, both inter- and intra-machine. Developing applications that efficiently make use of such systems is a challenge, especially for application-domain programmers. In this paper we present a new version of the Anthill programming environment that efficiently exploits multi-level parallelism, together with experimental results that demonstrate such efficiency. Anthill is based on the filter-stream model; in this model, applications are decomposed into a set of filters communicating through streams, which has already been shown to be efficient for expressing inter-machine parallelism. We replaced the filter run-time environment, originally process-oriented, with an event-oriented version. This new version allows programmers to efficiently express opportunities for parallelism within each compute node through a higher-level programming abstraction. We evaluated our solution on dual- and quad-core machines with two data mining applications: Eclat and KNN. Both had drops in execution time nearly proportional to the number of cores on a single machine. When using a cluster of dual-core machines, speed-ups were close to linear in the number of available cores for both applications, confirming that the event-oriented Anthill performs well on both the inter- and intra-machine parallelism levels.
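The filter-stream decomposition can be sketched with threads and queues: each filter is an independent worker, and streams are the channels connecting them. This is a minimal illustration of the model, not Anthill's actual API; all names here are invented for the example.

```python
import queue
import threading

SENTINEL = object()  # end-of-stream marker

def filter_stage(fn, inbox, outbox):
    """A filter: apply fn to every item on the input stream, emit the
    result on the output stream, and propagate end-of-stream."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)
            return
        outbox.put(fn(item))

def run_pipeline(items, stages):
    """Wire the filters in `stages` together with streams (queues),
    feed `items` through, and collect the final output in order."""
    streams = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [threading.Thread(target=filter_stage,
                                args=(fn, streams[i], streams[i + 1]))
               for i, fn in enumerate(stages)]
    for t in threads:
        t.start()
    for x in items:
        streams[0].put(x)
    streams[0].put(SENTINEL)
    out, last = [], streams[-1]
    while True:
        item = last.get()
        if item is SENTINEL:
            break
        out.append(item)
    for t in threads:
        t.join()
    return out
```

Each filter runs concurrently with the others, which is exactly what makes the model a natural fit for both inter- and intra-machine parallelism.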
symposium on computer architecture and high performance computing | 2007
Wendel Mombaque dos Santos; Tatiane Gomes Teixeira; Cristian Cleder Machado; Wagner Meira; A.S. Da Silva; D.R. Ferreira; Dorgival O. Guedes
The identification of replicas in a database is fundamental to improving the quality of its information. Deduplication is the task of identifying records in a database that refer to the same real-world entity. This process is not always trivial, because data may be corrupted during gathering, storage, or even manipulation. Problems such as misspelled names, data truncation, data entered in the wrong format, lack of conventions (such as how to abbreviate a name), missing data, or even fraud may lead to the insertion of replicas in a database. The deduplication process may be very hard, if not impossible, to perform manually, since real databases may have hundreds of millions of records. In this paper, we present our parallel deduplication algorithm, called FERAPARDA. By using probabilistic record linkage, we were able to successfully detect replicas in synthetic datasets with more than 1 million records in about 7 minutes using a 20-computer cluster, achieving an almost linear speedup. We believe our results have no parallel in the literature for datasets of this size at this processing time.

MPI (Message Passing Interface) is the de facto standard in high-performance computing. By using some new MPI-2 features, such as the dynamic creation of processes, it is possible to implement highly efficient parallel programs that can run on dynamic and/or heterogeneous resources, provided a good schedule of the processes can be computed at run-time. A classical solution for scheduling parallel programs on-line is work stealing. However, its use with MPI-2 is complicated by a restricted communication scheme between processes: spawned processes in MPI-2 can only communicate with their direct parents. This work presents an on-line scheduling algorithm, called Hierarchical Work Stealing, to obtain good load balancing of MPI-2 programs that follow a divide-and-conquer strategy. Experimental results are provided, based on a synthetic application, the N-Queens computation. The results show that the Hierarchical Work Stealing algorithm enables the use of MPI with high efficiency, even on parallel dynamic HPC platforms that are not as homogeneous as clusters.
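The basic work-stealing discipline behind such schedulers can be shown with a toy single-threaded simulation: each worker pops tasks from the tail of its own deque, and an idle worker steals from the head of a busy victim's deque. This illustrates the classic policy only; the paper's contribution is a hierarchical variant built on MPI-2 process spawning, which this sketch does not attempt to model.

```python
from collections import deque

def work_stealing_run(task_lists, work_fn):
    """Simulate round-robin workers: a worker with tasks pops from its
    own tail (LIFO); an empty worker steals one task from the head
    (FIFO) of the first victim that has tasks to spare."""
    deques = [deque(tasks) for tasks in task_lists]
    done = []
    while any(deques):
        for d in deques:
            if d:
                done.append(work_fn(d.pop()))       # own work: tail
            else:
                for victim in deques:
                    if len(victim) > 1:             # leave victims one task
                        d.append(victim.popleft())  # steal: head
                        break
    return done
```

Stealing from the head tends to grab the largest remaining subproblems of a divide-and-conquer computation, which is why the policy balances load well with few steals.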
Computer Networks | 2013
Pedro Henrique B. Las-Casas; Dorgival O. Guedes; Jussara M. Almeida; Artur Ziviani; Humberto Torres Marques-Neto
Despite the large variety and wide adoption of techniques to detect and filter unsolicited messages (spam), the total amount of such messages on the Internet remains very large. Some reports indicate that around 80% of all emails are spam. As a consequence, significant amounts of network resources are still wasted, since filtering strategies are usually applied only at the destination email server. Moreover, a considerable part of these unsolicited messages is sent by users who are unaware of their spamming activity and may thus inadvertently be classified as spammers. In this case, these oblivious users act as spambots, i.e., members of a spamming botnet. This paper proposes a new method for detecting spammers at the source network, whether they are individual malicious users or oblivious members of a spamming botnet. Our method, called SpaDeS, is based on a supervised classification technique and relies only on network-level metrics, thus not requiring inspection of message content. We evaluate SpaDeS using real datasets collected from a Brazilian broadband ISP. Our results show that our method is quite effective, correctly classifying the vast majority (87%) of the spammers while misclassifying only around 2% of the legitimate users.
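As a stand-in for the supervised-classification idea, here is a single-feature decision stump trained on labeled network-level measurements (for example, distinct SMTP destinations per hour). SpaDeS itself uses a real classifier over several such metrics; the feature, threshold search, and labels below are hypothetical.

```python
def train_stump(samples, labels):
    """samples: one network-level feature value per user;
    labels: 1 = known spammer, 0 = legitimate user.
    Returns the threshold that minimizes training errors for the
    rule 'predict spammer when feature >= threshold'."""
    best_t, best_errors = None, float("inf")
    for t in sorted(set(samples)):
        errors = sum((x >= t) != bool(y) for x, y in zip(samples, labels))
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t

def predict(threshold, x):
    return int(x >= threshold)  # 1 = flag as spammer
```

On a toy training set where legitimate users contact a handful of mail servers and spambots contact many, the stump learns a threshold between the two groups and flags new high-fanout users.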