Pedro D. Medeiros | Researchain

Publication

Featured research published by Pedro D. Medeiros.

international conference on parallel processing | 2013

Algorithmic skeleton framework for the orchestration of GPU computations

Ricardo Marques; Hervé Paulino; Fernando Alexandre; Pedro D. Medeiros

The Graphics Processing Unit (GPU) is gaining popularity as a co-processor to the Central Processing Unit (CPU). However, harnessing its capabilities is a non-trivial exercise that requires good knowledge of parallel programming, all the more so as the complexity of these applications keeps rising. Languages such as StreamIt [1] and Lime [2] have addressed the offloading of composed computations to GPUs. However, to the best of our knowledge, no support exists at library level. To this end, we propose Marrow, an algorithmic skeleton framework for the orchestration of OpenCL computations. Marrow expands the set of skeletons currently available for GPU computing, and enables their combination, through nesting, into complex structures. Moreover, it introduces optimizations that overlap communication and computation, thus conjoining programming simplicity with performance gains in many application scenarios. We evaluated the framework from a performance perspective, comparing it against hand-tuned OpenCL programs. The results are favourable, indicating that Marrow's skeletons are both flexible and efficient in the context of GPU computing.
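
To make the idea of combining skeletons through nesting more concrete, the following is a minimal CPU-only sketch in Python. It is not Marrow's API: the class names `Stage`, `Pipeline` and `Map` are hypothetical, plain Python functions stand in for OpenCL kernels, and nothing here overlaps communication with computation.

```python
from typing import Any, Callable, List


class Stage:
    """Leaf skeleton: wraps a single function (a stand-in for one kernel)."""

    def __init__(self, fn: Callable[[Any], Any]) -> None:
        self.fn = fn

    def run(self, data: Any) -> Any:
        return self.fn(data)


class Pipeline:
    """Runs its child skeletons in sequence, feeding each output to the next."""

    def __init__(self, *children: Any) -> None:
        self.children = children

    def run(self, data: Any) -> Any:
        for child in self.children:
            data = child.run(data)
        return data


class Map:
    """Applies one child skeleton independently to every element of a list."""

    def __init__(self, child: Any) -> None:
        self.child = child

    def run(self, data: List[Any]) -> List[Any]:
        return [self.child.run(item) for item in data]


# Nesting: the outer pipeline's first stage is a map over an inner pipeline.
inner = Pipeline(Stage(lambda x: x + 1), Stage(lambda x: x * 2))
program = Pipeline(Map(inner), Stage(sum))
result = program.run([1, 2, 3])
print(result)  # 18
```

Because every skeleton exposes the same `run` method, any skeleton can be used as a child of any other, which is the property that allows composition into larger structures.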

Future Generation Computer Systems | 2005

Future trends in distributed applications and problem-solving environments

José C. Cunha; Omer Rana; Pedro D. Medeiros

As grid computing technologies and infrastructures are being developed, suitable abstractions, methods, and tools will become necessary to enable application development, and software development of the components of grid computing environments. Grid computing will enable distributed applications with large numbers of involved components with dynamic interactions. This requires new approaches to understand and manage structure and behaviour, and the diversity of interactions among system components. This paper discusses emerging trends in distributed applications on large-scale and dynamic grid computing infrastructures. These trends allow us to identify the need to develop suitable software models, methods and tools for grid computing environments, in order to help specify, compose, and develop dynamic distributed large-scale applications.

high performance distributed computing | 2006

Cooperative Caching in the pCFS parallel Cluster File System

Paulo Afonso Lopes; Pedro D. Medeiros

This short paper describes the cooperative caching architecture of pCFS, a shared disk cluster file system (CFS) which aims to achieve high performance in a broad spectrum of I/O intensive applications ranging from computational access to large data sets to video streaming and databases, and includes an extended API for parallel I/O access. pCFS is targeted at small to medium sized clusters where data is stored in fibre channel shared devices on a storage area network (SAN) and exploits two interconnect fabrics: a SAN to access on-disk data, and a LAN, used both for the exchange of control information (related to locking and cache management) and for cooperative caching dataflow.

european conference on parallel processing | 1997

Interconnecting Multiple Heterogeneous Parallel Application Components

Pedro D. Medeiros; José C. Cunha

We present an infrastructure for building parallel applications by interconnecting slightly modified pre-existing parallel components. This infrastructure (called PHIS) allows the cooperation of components that run in different parallel machines. In succession, we describe the rationale behind PHIS, the primitives used to interconnect the application components and its internal architecture and we compare PHIS to related systems. Finally, we present an application where PHIS is used to interconnect several distinct components that define a parallel heterogeneous computational steering architecture for genetic algorithm applications.

International Journal of Creative Interfaces and Computer Graphics | 2013

Object Identification in Binary Tomographic Images Using GPGPUs

Bruno Preto; Fernando Pedro Birra; Adriano Lopes; Pedro D. Medeiros

The authors present a hybrid OpenCL CPU/GPU algorithm for identification of connected structures inside black and white 3D scientific data. This algorithm exploits parallelism both at CPU and GPGPU levels, but the work is predominantly done in GPUs. The underlying context of this work is the structural characterization of composite materials via tomography. The algorithm allows us to later infer location and morphology of objects inside composite materials. Moreover, execution times are very low, allowing us to process large data sets within acceptable running times. Intermediate solutions are computed independently over a partition of the spatial domain, following the data parallelism paradigm, and then integrated both at GPU and CPU levels, using parallel multi-cores. The authors consistently explore parallelism both at the CPU level, by allowing the CPU stage to run in multiple concurrent threads, and at the GPU level with massive parallelism and concurrent data transfers and kernel executions.
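
As an illustration of the underlying problem only, the sketch below labels connected structures in a small binary 3D volume. It is a sequential breadth-first search on the CPU, not the authors' hybrid OpenCL algorithm. It assumes 6-connectivity (face neighbours) and a non-empty volume stored as nested lists; the function name `label_components` is chosen for this example.

```python
from collections import deque
from typing import List, Tuple

Volume = List[List[List[int]]]

# Offsets of the six face neighbours of a voxel (assumed connectivity).
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def label_components(volume: Volume) -> Tuple[Volume, int]:
    """Return a label volume (0 = background) and the number of components."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    labels = [[[0] * width for _ in range(height)] for _ in range(depth)]
    count = 0
    for z in range(depth):
        for y in range(height):
            for x in range(width):
                # Skip background voxels and voxels that are already labelled.
                if volume[z][y][x] == 0 or labels[z][y][x] != 0:
                    continue
                count += 1
                labels[z][y][x] = count
                queue = deque([(z, y, x)])
                # Flood the whole component reachable from this seed voxel.
                while queue:
                    cz, cy, cx = queue.popleft()
                    for dz, dy, dx in NEIGHBOURS:
                        nz, ny, nx = cz + dz, cy + dy, cx + dx
                        inside = 0 <= nz < depth and 0 <= ny < height and 0 <= nx < width
                        if inside and volume[nz][ny][nx] != 0 and labels[nz][ny][nx] == 0:
                            labels[nz][ny][nx] = count
                            queue.append((nz, ny, nx))
    return labels, count


volume = [
    [[1, 1, 0],
     [0, 0, 0]],
    [[0, 1, 0],
     [0, 0, 1]],
]
labels, count = label_components(volume)
print(count)  # 2
```

A data-parallel version along the lines the abstract describes would label each block of the partitioned domain independently and then merge labels across block boundaries; that merge step is not shown here.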

european conference on parallel processing | 2010

pCFS vs. PVFS: comparing a highly-available symmetrical parallel cluster file system with an asymmetrical parallel file system

Paulo Afonso Lopes; Pedro D. Medeiros

pCFS is a highly available parallel, symmetrical (where nodes perform both compute and I/O work) cluster file system that we have designed to run in medium-sized clusters. In this paper, using exactly the same hardware and Linux version across all nodes, we compare pCFS with two distinct configurations of PVFS: one using internal disks, and therefore not able to provide any tolerance against disk and/or I/O node failures, and another where PVFS I/O servers access LUNs in a disk array and thus provide high availability (in the following named HA-PVFS). We start by measuring I/O bandwidth and CPU consumption of PVFS and HA-PVFS setups; then, the same set of tests is performed with pCFS. We conclude that, when using the same hardware, pCFS compares very favourably with HA-PVFS, offering the same or higher I/O bandwidths at a much lower CPU consumption.

international conference on cluster computing | 2008

Enhancing write performance of a shared-disk cluster filesystem through a fine-grained locking strategy

Paulo Afonso Lopes; Pedro D. Medeiros

We present part of our recent work on performance enhancement of cluster file systems using shared disks over a SAN. This work is built around the proposal of pCFS, a file system specifically targeting those environments. In earlier work we presented the objectives and design principles of pCFS and a proof-of-concept implementation, carried out by modifying Red Hat's GFS, showing significant improvements in operations over files shared among processes running in different nodes. pCFS differs from GFS in two main aspects: its use of cooperative caching and a finer grain of locking. The first aspect, which used the LAN to enhance performance in write sharing situations, was described elsewhere; we now introduce a complementary strategy, locking file regions instead of the whole file, which enables us to use the SAN while delivering a high level of performance in those same write sharing situations. pCFS may apply inter-node locks to regions, allowing processes to operate in parallel with a minimum of coherency overhead among nodes; a process cannot access data outside its region(s) and, when a writer unlocks a region, others can then lock it and see the modified data immediately. Through a set of experiments where a file is shared between processes running in different nodes, we show that the described approach allows a gain of at least an order of magnitude over plain GFS.
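
The difference between locking a whole file and locking regions can be sketched with a toy lock table. This is an in-memory, single-process illustration with exclusive locks only. It is not the pCFS or GFS inter-node locking protocol, and the class `RegionLockTable` and its methods are hypothetical names introduced for this example.

```python
from typing import Dict, Tuple


class RegionLockTable:
    """Tracks exclusive locks on half-open byte ranges [start, end) of one file."""

    def __init__(self) -> None:
        self._held: Dict[Tuple[int, int], str] = {}

    def try_lock(self, owner: str, start: int, end: int) -> bool:
        """Grant the lock only if the range overlaps no range already held."""
        if start >= end:
            raise ValueError("region must be non-empty")
        for held_start, held_end in self._held:
            if start < held_end and held_start < end:  # the two ranges overlap
                return False
        self._held[(start, end)] = owner
        return True

    def unlock(self, owner: str, start: int, end: int) -> None:
        """Release a region; only the owner that locked it may do so."""
        if self._held.get((start, end)) != owner:
            raise KeyError("region is not locked by this owner")
        del self._held[(start, end)]


table = RegionLockTable()
first = table.try_lock("node-a", 0, 4096)       # granted
second = table.try_lock("node-b", 4096, 8192)   # granted: disjoint region
third = table.try_lock("node-c", 2048, 6144)    # refused: overlaps both
table.unlock("node-a", 0, 4096)
fourth = table.try_lock("node-c", 0, 4096)      # granted after the unlock
print(first, second, third, fourth)  # True True False True
```

With a single whole-file lock, the second request would have had to wait for the first writer. With region locks, writers to disjoint regions proceed in parallel, which is the effect the abstract attributes to the finer locking grain.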

european pvm mpi users group meeting on recent advances in parallel virtual machine and message passing interface | 2002

Porting PVM to the VIA Architecture Using a Fast Communication Library

Roberto Espenica; Pedro D. Medeiros

In this paper we present an implementation of PVM over the VIA architecture. Using VIA, the performance of the PVM communication primitives approaches the real hardware capabilities. As VIA is an industry standard for high performance communication on system area networks, this implementation runs on every VIA-conformant platform. The current PVM implementation is based on Berkeley Sockets. To ease the integration of VIA, a stream library (LSL - Lite Stream Library) was developed. LSL supplies a socket-like interface to VIA. LSL can be used directly by applications, thus improving their communication performance, especially for small messages. Performance results obtained in the current prototype (Linux cluster using M-VIA) already show that LSL has some performance gains over the native socket interface, but is still open to enhancements.

ieee international conference on high performance computing data and analytics | 1998

The DOTPAR Project: Towards a Framework Supporting Domain Oriented Tools for Parallel and Distributed Processing

José C. Cunha; Pedro D. Medeiros; João Lourenço; Vítor Duarte; João Vieira; Bruno Moscão; Daniel Pereira; Rui Vaz

We discuss the problem of building domain oriented environments by a composition of heterogeneous application components and tools. We describe several individual tools that support such environments, namely a distributed monitoring and control tool (DAMS), a process-based distributed debugger (PDBG) and a heterogeneous interconnection model (PHIS). We discuss our experience with the development of a Problem Oriented Environment in the domain of genetic algorithms, obtained by a composition of heterogeneous tools and application components.

technical symposium on computer science education | 2005

Using a PC simulator to illustrate input-output programming techniques

Pedro D. Medeiros; Vítor Duarte; M. Cecilia Gomes; Rui F. Marques

We present our use of the Bochs PC emulator in a series of practical assignments that, in a basic computer architecture course, introduce polling and interrupt-based input-output programming techniques.
