Nuno Sebastião | Researchain

Publication

Featured research published by Nuno Sebastião.

application specific systems architectures and processors | 2013

BioBlaze: Multi-core SIMD ASIP for DNA sequence alignment

Nuno Neves; Nuno Sebastião; Andre Patricio; David Martins de Matos; Pedro Tomás; Paulo F. Flores; Nuno Roma

A new Application-Specific Instruction-set Processor (ASIP) architecture for biological sequence alignment is proposed in this manuscript. This architecture achieves high processing throughputs by exploiting both fine and coarse-grained parallelism. The former is achieved by extending the Instruction Set Architecture (ISA) of a synthesizable processor to include multiple specialized SIMD instructions that implement vector-vector and vector-scalar arithmetic, logic, load/store and control operations. Coarse-grained parallelism is achieved by using multiple cores to cooperatively align multiple sequences in a shared memory architecture, comprising proper hardware-specific synchronization mechanisms. To ease the programming, a compilation framework based on an adaptation of the GCC back-end was also implemented. The proposed system was prototyped and evaluated on a Xilinx Virtex-7 FPGA, achieving a 200 MHz working frequency. A sequential and a state-of-the-art SIMD implementation of the Smith-Waterman algorithm were programmed on both the proposed ASIP and an Intel Core i7 processor. When comparing the achieved speedups, it was observed that the proposed ISA achieves a 40x speedup, which contrasts with the 11x speedup provided by SSE2 in the Intel Core i7 processor. The scalability of the multi-core system was also evaluated and proved to scale almost linearly with the number of cores.
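
For readers unfamiliar with the Smith-Waterman algorithm that the abstract above uses as its benchmark, the following is a minimal sequential sketch of the score computation only. It is not the ASIP or SSE2 code from the paper: the function name `smith_waterman_score` and the scoring parameters (`match=2`, `mismatch=-1`, linear `gap=-1`) are assumptions chosen for illustration.

```python
def smith_waterman_score(query, reference, match=2, mismatch=-1, gap=-1):
    """Return (best local alignment score, (row, col) of the cell where it ends).

    Illustrative scoring values (assumed): +2 match, -1 mismatch, -1 per gap.
    Only two rows of the dynamic programming matrix are kept in memory.
    """
    prev = [0] * (len(reference) + 1)
    best, best_cell = 0, (0, 0)
    for i, q in enumerate(query, start=1):
        curr = [0] * (len(reference) + 1)
        for j, r in enumerate(reference, start=1):
            diag = prev[j - 1] + (match if q == r else mismatch)
            # A local alignment may restart anywhere, so scores are clamped at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            if curr[j] > best:
                best, best_cell = curr[j], (i, j)
        prev = curr
    return best, best_cell


if __name__ == "__main__":
    print(smith_waterman_score("GGACGTGG", "TTACGTTT"))
```

Each cell depends only on its upper, left and upper-left neighbours, which is the data dependency that SIMD and multi-core implementations have to work around.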

IEEE Transactions on Very Large Scale Integration Systems | 2015

Multicore SIMD ASIP for Next-Generation Sequencing and Alignment Biochip Platforms

Nuno Neves; Nuno Sebastião; David Martins de Matos; Pedro Tomás; Paulo F. Flores; Nuno Roma

Targeting the development of new biochip platforms capable of autonomously sequencing and aligning biological sequences, a new multicore processing structure is proposed in this manuscript. This multicore structure makes use of a shared memory model and multiple instantiations of a novel application-specific instruction-set processor (ASIP) to simultaneously exploit both fine and coarse-grained parallelism and to achieve high performance levels at low-power consumption. The proposed ASIP is built by extending the instruction set architecture of a synthesizable processor, including both general and special-purpose single-instruction multiple-data instructions. This allows an efficient exploitation of fine-grained parallelism on the alignment of biological sequences, achieving over 30× speedup when compared with sequential algorithmic implementations. The complete system was prototyped on different field-programmable gate array platforms and synthesized with a 90-nm CMOS process technology. Experimental results demonstrate that the multicore structure scales almost linearly with the number of instantiated cores, achieving performances similar to a quad-core Intel Core i7 3820 processor, while using 25× less energy.

IEEE Transactions on Very Large Scale Integration Systems | 2012

Integrated Hardware Architecture for Efficient Computation of the n-Best Bio-Sequence Local Alignments in Embedded Platforms

Nuno Sebastião; Nuno Roma; Paulo F. Flores

A flexible hardware architecture that implements a set of new and efficient techniques to significantly reduce the computational requirements of the commonly used Smith-Waterman sequence alignment algorithm is presented. Such innovative techniques use information gathered by the hardware accelerator during the computation of the alignment scores to constrain the size of the subsequence that has to be post-processed in the traceback phase using a general purpose processor (GPP). Moreover, the proposed structure is also capable of computing the n-best local alignments according to the Waterman-Eggert algorithm, becoming the first hardware architecture that is able to simultaneously evaluate the n-best alignments of a given sequence pair, by incorporating a set of ordering units that work in parallel with the systolic array. A complete alignment system was developed and implemented in a Virtex-4 FPGA, by integrating the proposed accelerator architecture with a Leon3 GPP. The obtained experimental results demonstrate that the proposed system is flexible and allows the alignment of large sequences in memory constrained systems. As an example, a speedup of 17 was obtained with the conceived system when compared with a regular implementation of the LALIGN35 program running on an Intel Core2 Duo processor running at a 40 × higher frequency.

international conference on high performance computing and simulation | 2011

Advantages and GPU implementation of high-performance indexed DNA search based on suffix arrays

Gustavo Encarnação; Nuno Sebastião; Nuno Roma

A comparative analysis of high-performance implementations of two state-of-the-art index structures that are of particular interest in the field of bioinformatics applications to accelerate the alignment of DNA sequences is presented. The two indexes are based on suffix trees and suffix arrays and were implemented in two different platforms: a quad-core CPU and an NVIDIA GeForce GTX 580 GPU, based on the newest Fermi architecture. Unlike what happens in conventional CPU implementations, the obtained experimental results reveal that GPU implementations clearly favor the suffix arrays, due to the achieved performance in terms of memory accesses. When compared with the CPU, the results demonstrate the possibility to achieve speedups as high as 85 when using the suffix array in the GPU, thus making it an adequate choice for high-performance bioinformatics applications.
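
As a rough illustration of how a suffix array supports the indexed search mentioned above, the sketch below builds the array naively and locates a pattern with two binary searches. It is a CPU-side toy, not the GPU implementation evaluated in the paper; the helper names `build_suffix_array` and `find_occurrences` are hypothetical, and the quadratic-memory construction is only suitable for short strings.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of `text` in lexicographic order.

    Naive construction (sorts the suffixes directly); fine for a short example.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, pattern):
    """Return the sorted start positions of `pattern` in `text`."""
    m = len(pattern)

    def prefix(k):
        # First m characters of the k-th smallest suffix.
        start = suffix_array[k]
        return text[start:start + m]

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[first:lo])


if __name__ == "__main__":
    dna = "GATTACATTAG"
    sa = build_suffix_array(dna)
    print(find_occurrences(dna, sa, "TTA"))
```

All matches occupy one contiguous range of the array, so a lookup touches a flat integer array rather than following tree pointers.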

Concurrency and Computation: Practice and Experience | 2015

Implementation and performance analysis of efficient index structures for DNA search algorithms in parallel platforms

Nuno Sebastião; Gustavo Encarnação; Nuno Roma

Because of the large datasets that are usually involved in deoxyribonucleic acid (DNA) sequence alignment, the use of optimal local alignment algorithms (e.g., Smith–Waterman) is often unfeasible in practical applications. As such, more efficient solutions that rely on indexed search procedures are often preferred to significantly reduce the time to obtain such alignments. Some data structures that are usually adopted to build such indexes are suffix trees, suffix arrays, and the hash tables of q‐mers.

international conference on high performance computing and simulation | 2010

Integrated accelerator architecture for DNA sequences alignment with enhanced traceback phase

Nuno Sebastião; Tiago Dias; Nuno Roma; Paulo F. Flores

Dynamic programming algorithms are widely used to find the optimal sequence alignment between any two DNA sequences. This paper presents an innovative technique to significantly reduce the computation time and memory space requirements of the traceback phase of the Smith-Waterman algorithm, together with a flexible and scalable hardware architecture to accelerate the overall procedure. The results obtained from an implementation using a Virtex-4 FPGA showed that the proposed technique is feasible and is able to provide a significant speedup. For the considered test sequences, a speedup of about 6000 was obtained.

Concurrency and Computation: Practice and Experience | 2013

Configurable and scalable class of high performance hardware accelerators for simultaneous DNA sequence alignment

Nuno Sebastião; Nuno Roma; Paulo F. Flores

A new class of efficient and flexible hardware accelerators for DNA local sequence alignment based on the widely used Smith–Waterman algorithm is proposed in this paper. This new class of accelerating structures exploits an innovative technique that tracks the origin coordinates of the best alignment to allow a significant reduction of the size of the dynamic programming matrix that needs to be recomputed during the subsequent traceback phase, providing a considerable reduction of the resulting time and memory requirements. The significant performance of the enhanced class of accelerators is attained by also providing support for an additional level of parallelism: the capability to concurrently align several query sequences with one or more reference sequences, according to the specific application requisites. Moreover, the accelerator class also includes specially designed processing elements that improve the resource usage when implemented in a Field Programmable Gate Array (FPGA), and easily provide several different configurations in an Application Specific Integrated Circuit (ASIC) implementation. Obtained results demonstrated that speedups as high as 278 can be obtained in ASIC accelerating structures. An FPGA‐based prototyping platform, operating at a 40 times lower clock frequency and incorporating a complete alignment embedded system, still provides significant speedups as high as 27, compared with a pure software implementation.
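
The origin-tracking idea described above can be sketched in software as follows. This is an assumed illustration, not the accelerator's actual design: every cell carries, next to its score, the coordinates of the cell where its local alignment started, so that the traceback only needs the sub-matrix between that origin and the best-scoring cell. The function name `sw_with_origin` and the scoring values are hypothetical.

```python
def sw_with_origin(query, reference, match=2, mismatch=-1, gap=-1):
    """Return (best score, origin cell, end cell) of the best local alignment.

    Cells are 1-indexed (row in query, column in reference). The origin is the
    first aligned cell of the path that reaches the best score. Scoring values
    are illustrative assumptions.
    """
    n = len(reference)
    prev_score, prev_origin = [0] * (n + 1), [None] * (n + 1)
    best = (0, None, None)
    for i, q in enumerate(query, start=1):
        curr_score, curr_origin = [0] * (n + 1), [None] * (n + 1)
        for j, r in enumerate(reference, start=1):
            candidates = [
                # Diagonal move: a path starting from a zero cell originates here.
                (prev_score[j - 1] + (match if q == r else mismatch),
                 prev_origin[j - 1] or (i, j)),
                # Gap moves inherit the origin of the cell they extend.
                (prev_score[j] + gap, prev_origin[j]),
                (curr_score[j - 1] + gap, curr_origin[j - 1]),
            ]
            score, origin = max(candidates, key=lambda c: c[0])
            if score > 0:
                curr_score[j], curr_origin[j] = score, origin
            if curr_score[j] > best[0]:
                best = (curr_score[j], curr_origin[j], (i, j))
        prev_score, prev_origin = curr_score, curr_origin
    return best


if __name__ == "__main__":
    score, origin, end = sw_with_origin("GGACGTGG", "TTACGTTT")
    rows = end[0] - origin[0] + 1
    cols = end[1] - origin[1] + 1
    print(score, origin, end, f"traceback sub-matrix: {rows} x {cols}")
```

In this toy case the traceback would only need to recompute a 4 x 4 sub-matrix instead of the full 8 x 8 matrix.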

Microprocessors and Microsystems | 2012

Hardware accelerator architecture for simultaneous short-read DNA sequences alignment with enhanced traceback phase

Nuno Sebastião; Nuno Roma; Paulo F. Flores

Dynamic programming algorithms are widely used to find the optimal sequence alignment between any two DNA sequences. This manuscript presents a new, flexible and scalable hardware accelerator architecture to speed up the implementation of the frequently used Smith-Waterman algorithm. When integrated with a general purpose processor, the developed accelerator significantly reduces the computation time and memory space requirements of alignment tasks. Such efficiency mainly comes from two innovative techniques that are proposed. The first is the use of the maximum score cell coordinates, gathered during the computation of the alignment scores in the matrix-fill phase, to significantly reduce the time and memory requirements of the traceback phase. The second is the exploitation of an additional level of parallelism to simultaneously align several query sequences with the same reference sequence, targeting the processing of short-read DNA sequences. The results obtained from the implementation of a complete alignment system based on the new accelerator architecture in a Virtex-4 FPGA showed that the proposed techniques are feasible and the developed accelerator is able to provide speedups as high as 16 for the considered test sequences. Moreover, it was also shown that the proposed approach allows the processing of larger DNA sequences in memory restricted environments.

digital systems design | 2008

Application Specific Programmable IP Core for Motion Estimation: Technology Comparison Targeting Efficient Embedded Co-Processing Units

Nuno Sebastião; Tiago Dias; Nuno Roma; Paulo F. Flores; Leonel Sousa

The implementation of a recently proposed IP core of an efficient motion estimation co-processor is considered. Some significant functional improvements to the base architecture are proposed, as well as the presentation of a detailed description of the interfacing between the co-processor and the main processing unit of the video encoding system. Then, a performance analysis of two distinct implementations of this IP core is presented, considering two different target technologies: a high performance FPGA device, from the Xilinx Virtex-II Pro family, and an ASIC based implementation, using a 0.18 μm CMOS StdCell library. Experimental results have shown that the two alternative implementations have quite similar performance levels and allow the estimation of motion vectors in real-time.

Archive | 2010