João Bispo | Researchain

Publication

Featured researches published by João Bispo.

field-programmable technology | 2006

Regular expression matching for reconfigurable packet inspection

João Bispo; Ioannis Sourdis; João M. P. Cardoso; Stamatis Vassiliadis

Recent intrusion detection systems (IDS) use regular expressions instead of static patterns as a more efficient way to represent hazardous packet payload contents. This paper focuses on regular expression pattern matching engines implemented in reconfigurable hardware. A nondeterministic finite automaton (NFA) based implementation is presented, which takes advantage of new basic building blocks to support more complex regular expressions than previous approaches. The methodology is supported by a tool that automatically generates the circuitry for the given regular expressions, outputting VHDL representations ready for logic synthesis. Furthermore, techniques to reduce the area cost of our designs and maximize performance when targeting FPGAs are included. Experimental results show that our tool is able to generate a regular expression engine to match more than 500 IDS regular expressions (from the Snort ruleset) using only 25K logic cells and achieving 2 Gbps throughput on a Virtex2 and 2.9 Gbps on a Virtex4 device. Concerning the throughput per area required per matching non-meta character, our design is 3.4 and 10 times more efficient than previous ASIC and FPGA approaches, respectively.
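
As a rough software illustration of the NFA-based matching idea in the abstract above, the sketch below simulates a small automaton in which every state can be active at once, much as one flip-flop per state would be in a one-hot hardware encoding. The pattern `a b+ c`, the `TRANSITIONS` table, and the function name `match_positions` are hypothetical and chosen only for illustration; they are not output of the paper's tool.

```python
# Hypothetical hand-built NFA for the pattern  a b+ c  (illustrative only).
# Each state stands in for one flip-flop of a one-hot hardware encoding.
TRANSITIONS = {
    (0, "a"): {1},
    (1, "b"): {2},
    (2, "b"): {2},
    (2, "c"): {3},
}
START, ACCEPT = 0, 3


def match_positions(payload):
    """Return the end indices at which the pattern matches anywhere in payload."""
    active = {START}
    hits = []
    for i, ch in enumerate(payload):
        # The start state is re-armed on every character, so matching is unanchored.
        next_active = {START}
        for state in active:
            next_active |= TRANSITIONS.get((state, ch), set())
        if ACCEPT in next_active:
            hits.append(i)
        active = next_active
    return hits


print(match_positions("xxabbcabc"))
```

All active states advance on each input character, which is the property that lets a hardware NFA consume one character per clock cycle regardless of how many partial matches are in flight.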

applied reconfigurable computing | 2007

Synthesis of regular expressions targeting FPGAs: current status and open issues

João Bispo; Ioannis Sourdis; João M. P. Cardoso; Stamatis Vassiliadis

This paper presents an overview regarding the synthesis of regular expressions targeting FPGAs. It describes current solutions and a number of open issues. Implementation of regular expressions can be very challenging when performance is critical. Software implementations may not be able to satisfy performance requirements and thus dedicated hardware engines have to be used. In the latter case, automatic synthesis tools are of paramount importance to achieve fast prototyping of regular expression engines. As a case study, experimental results are presented for FPGA implementations of the regular expressions included in the rule-set of a Network Intrusion Detection System (NIDS), Bleeding Edge, obtained using a state-of-the-art synthesis approach.

field-programmable technology | 2010

On identifying and optimizing instruction sequences for dynamic compilation

João Bispo; João M. P. Cardoso

Typical computing systems based on general purpose processors (GPPs) can be extended with coarse-grained reconfigurable arrays (CGRAs) to provide higher performance and/or energy savings. In order for applications to take advantage of these computing systems, possibly including CGRAs varying in size, efficient dynamic compilation/mapping techniques are required. Dynamic mapping will be responsible for automatically moving computations originally running in the GPP to the CGRA. This paper presents our approach to dynamically map computations to CGRAs coupled to a GPP. Specifically, we evaluate the potential of the MegaBlock to accelerate the execution of a number of representative benchmarks when targeting an architecture based on a GPP and a CGRA. In addition, we show the impact on performance when using constant folding and propagation optimizations.
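
The constant folding and propagation optimizations mentioned in the abstract above can be sketched on a straight-line instruction sequence as follows. The tuple-based instruction format `(dest, op, a, b)`, the register names, and the function `fold_constants` are assumptions made for this example and do not reflect the paper's actual intermediate representation.

```python
import operator

# Hypothetical operation set for the illustration.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def fold_constants(instructions):
    """Propagate known constants and fold instructions whose operands are all constant.

    Returns the remaining instructions and a map of registers with known values.
    """
    known = {}
    remaining = []
    for dest, op, a, b in instructions:
        # Propagation: replace a register operand by its known constant value.
        a = known.get(a, a)
        b = known.get(b, b)
        if isinstance(a, int) and isinstance(b, int):
            # Folding: compute the result now instead of emitting the instruction.
            known[dest] = OPS[op](a, b)
        else:
            # The destination no longer holds a known constant.
            known.pop(dest, None)
            remaining.append((dest, op, a, b))
    return remaining, known


sequence = [
    ("r1", "add", 2, 3),
    ("r2", "mul", "r1", "r4"),
    ("r3", "sub", "r1", 1),
]
print(fold_constants(sequence))
```

In this sketch, folded registers are reported in the returned map rather than re-emitted; a real mapper would still have to write back any folded value that is live after the sequence.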

computing frontiers | 2016

The ANTAREX approach to autotuning and adaptivity for energy efficient HPC systems

Cristina Silvano; Giovanni Agosta; Stefano Cherubin; Davide Gadioli; Gianluca Palermo; Andrea Bartolini; Luca Benini; Jan Martinovič; Martin Palkovic; Kateřina Slaninová; João Bispo; João M. P. Cardoso; Pedro Pinto; Carlo Cavazzoni; Nico Sanna; Andrea R. Beccari; Radim Cmar; Erven Rohou

The ANTAREX project aims at expressing application self-adaptivity through a Domain Specific Language (DSL) and at managing and autotuning applications at runtime for green and heterogeneous High Performance Computing (HPC) systems up to Exascale. The DSL approach allows the definition of energy-efficiency, performance, and adaptivity strategies as well as their enforcement at runtime through application autotuning and resource and power management. Using a mini-app extracted from one of the project application use cases, we show some initial exploration of application precision tuning enabled by the DSL.

International Journal of Reconfigurable Computing | 2013

Transparent Runtime Migration of Loop-Based Traces of Processor Instructions to Reconfigurable Processing Units

João Bispo; Nuno Miguel Cardanha Paulino; João M. P. Cardoso; João Canas Ferreira

The ability to map instructions running in a microprocessor to a reconfigurable processing unit (RPU), acting as a coprocessor, enables the runtime acceleration of applications and ensures code and possibly performance portability. In this work, we focus on the mapping of loop-based instruction traces (called Megablocks) to RPUs. The proposed approach considers offline partitioning and mapping stages without ignoring their future runtime applicability. We present a toolchain that automatically extracts specific trace-based loops, called Megablocks, from MicroBlaze instruction traces and generates an RPU for executing those loops. Our hardware infrastructure is able to move loop execution from the microprocessor to the RPU transparently, at runtime, and without changing the executable binaries. The toolchain and the system are fully operational. Three FPGA implementations of the system, differing in the hardware interfaces used, were tested and evaluated with a set of 15 application kernels. Speedups ranging from 1.26 to 3.69 were achieved for the best alternative using a MicroBlaze processor with local memory.

field-programmable logic and applications | 2010

On Identifying Segments of Traces for Dynamic Compilation

João Bispo; João M. P. Cardoso

Typical computing systems based on general purpose processors (GPPs) are extended with coarse-grained reconfigurable arrays (CGRAs) to provide higher performance and/or energy savings. In order for applications to take advantage of these computing systems, efficient dynamic mapping techniques are required. Those dynamic mapping techniques will be responsible for automatically moving computations originally running in the GPP to the CGRA. The concept of dynamic compilation, widespread in the context of JIT compilation to GPPs, is receiving more attention from the reconfigurable computing community. This paper presents our approach to dynamically map computations to CGRAs coupled to a GPP. Specifically, we present the identification of large sequences of instructions, MegaBlocks, being executed in a GPP. These MegaBlocks are then mapped to the target CGRA. We evaluate the potential of the MegaBlocks over Basic Blocks and Super Blocks to increase the IPC when targeting a CGRA and considering the execution of a number of representative benchmarks.
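
To give a feel for what identifying a repeating sequence in an execution trace can look like, the sketch below scans a list of instruction addresses for the first pattern that repeats back to back. It is a deliberately simplified stand-in: the function `find_repeating_pattern`, its parameters `max_len` and `min_reps`, and the example addresses are hypothetical, and the paper's actual MegaBlock detection may work quite differently.

```python
def find_repeating_pattern(trace, max_len=8, min_reps=3):
    """Return (start, pattern) for the first pattern of at most max_len entries
    that occurs min_reps times consecutively in trace, or None if there is none."""
    n = len(trace)
    for start in range(n):
        for size in range(1, max_len + 1):
            end = start + size * min_reps
            if end > n:
                break  # longer patterns cannot fit from this start either
            pattern = trace[start:start + size]
            repeats = all(
                trace[start + k * size:start + (k + 1) * size] == pattern
                for k in range(1, min_reps)
            )
            if repeats:
                return start, pattern
    return None


# Hypothetical trace of instruction addresses containing a three-instruction loop.
trace = [0x100, 0x104,
         0x200, 0x204, 0x208,
         0x200, 0x204, 0x208,
         0x200, 0x204, 0x208,
         0x300]
print(find_repeating_pattern(trace))
```

This brute-force scan is only meant to convey the idea; a runtime detector would need an incremental scheme that inspects each instruction once as it executes.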

reconfigurable computing and fpgas | 2011

From Instruction Traces to Specialized Reconfigurable Arrays

João Bispo; Nuno Miguel Cardanha Paulino; João M. P. Cardoso; João Canas Ferreira

This paper presents an offline tool-chain which automatically extracts loops (Megablocks) from MicroBlaze instruction traces and creates a tailored Reconfigurable Processing Unit (RPU) for those loops. The system moves loops from the CPU to the RPU transparently, at runtime, and without changing the executable binaries. The system was implemented in an FPGA and, for the tested kernels, measured speedups ranged between 3.9x and 18.2x for a MicroBlaze CPU without cache. We estimate speedups from 1.03x to 2.01x when comparing to the best estimated performance achieved with a single MicroBlaze.

design, automation, and test in europe | 2015

Transparent acceleration of program execution using reconfigurable hardware

Nuno Miguel Cardanha Paulino; João Canas Ferreira; João Bispo; João M. P. Cardoso

The acceleration of applications running on a general purpose processor (GPP), by mapping parts of their execution to reconfigurable hardware, is an approach which does not involve the program's source code and still ensures program portability over different target reconfigurable fabrics. However, the problem is very challenging, as suitable sequences of GPP instructions need to be translated/mapped to hardware, possibly at runtime. Thus, all mapping steps, from compiler analysis and optimizations to hardware generation, need to be both efficient and fast. This paper introduces some of the most representative approaches for binary acceleration using reconfigurable hardware, and presents our binary acceleration approach and the latest results. Our approach extends a GPP with a Reconfigurable Processing Unit (RPU), both sharing the data memory. Repeating sequences of GPP instructions are migrated to an RPU composed of functional units and interconnect resources, and able to exploit instruction-level parallelism, e.g., via loop pipelining. Although we envision a fully dynamic system, currently the RPU resources are selected and organized offline using execution trace information. We present implementation prototypes of the system on a Spartan-6 FPGA with a MicroBlaze as GPP and the very encouraging results achieved with a number of benchmarks.

Proceedings of ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming | 2014

Multi-Target C Code Generation from MATLAB

João Bispo; Luís Paulo Reis; João M. P. Cardoso

This paper describes our recent work on MATISSE, a framework for MATLAB to C compilation. We focus on the new optimizations and transformations, as well as on OpenCL generation. MATISSE is controlled with LARA, an aspect-oriented language, able to specify transformations to the input MATLAB code (e.g., insertion of code for variable initialization and for monitoring) and to express information concerning types and shapes of variables. We evaluate the compiler with a set of benchmarks when targeting both an embedded system and a desktop system. The results show that we were able to achieve a speedup of up to 1.8× by employing information provided by LARA aspects. We also compare the execution time of the generated C code with the original code running on MATLAB, and we achieve a geometric mean speedup of 19×. The geometric mean speedup reduces to 12× when optimizing the MATLAB code with LARA aspects. Finally, we present a preliminary version of a fully functioning pragma-based OpenCL generator, built over the MATISSE framework.

International Journal of Electronics | 2008

Synthesis of regular expressions for FPGAs

João Bispo; João M. P. Cardoso

Regular expressions are being used in many applications to specify multiple and complex text patterns in a compact way. In some of these applications large sets of regular expressions need to be evaluated to detect matched content. Specialised hardware engines are employed when software-based regular expression engines are not able to meet the performance requirements imposed by such applications. Since the sets of regular expressions are periodically modified and/or extended, FPGAs are an attractive hardware solution to achieve both programmability and high-performance demands. However, efficient automatic synthesis tools are of paramount importance to achieve fast prototyping of regular expression engines on these devices. This paper presents an overview of the synthesis of regular expressions with the aim of achieving high-performance engines for FPGAs. We focus on describing current solutions, proposing new solutions for constrained repetitions and overlapped matching, and discussing a number of challenges and open issues. As a case study, we present FPGA implementations of the regular expressions included in the rule-sets of two network intrusion detection systems (NIDS), Bleeding Edge and Snort, obtained using a state-of-the-art synthesis approach.
