Sebastian Isaza
Delft University of Technology
Publication
Featured research published by Sebastian Isaza.
international symposium on microarchitecture | 2010
Alex Ramirez; Felipe Cabarcas; Ben H. H. Juurlink; Mauricio Alvarez Mesa; Friman Sánchez; Arnaldo Azevedo; Cor Meenderinck; Catalin Bogdan Ciobanu; Sebastian Isaza; Georgi Gaydadjiev
The SARC architecture is composed of multiple processor types and a set of user-managed direct memory access (DMA) engines that let the runtime scheduler overlap data transfer and computation. The runtime system automatically allocates tasks on the heterogeneous cores and schedules the data transfers through the DMA engines. SARC's programming model supports various highly parallel applications, with matching support from specialized accelerator processors.
field programmable gate arrays | 2014
Georgios Smaragdos; Sebastian Isaza; Martijn F. van Eijk; Ioannis Sourdis; Christos Strydis
The Inferior-Olivary nucleus (ION) is a well-charted region of the brain, heavily associated with sensorimotor control of the body. It comprises ION cells with unique properties which facilitate sensory processing and motor-learning skills. Various simulation models of ION-cell networks have been written in an attempt to unravel their mysteries. However, simulations rapidly become intractable when biophysically plausible models and meaningful network sizes (>=100 cells) are modeled. To overcome this problem, in this work we port a highly detailed ION cell network model, originally coded in Matlab, onto an FPGA chip. It was first converted to ANSI C code and extensively profiled. It was then translated to HLS C code for the Xilinx Vivado toolflow, and various algorithmic and arithmetic optimizations were applied. The design was implemented in a Virtex 7 (XC7VX485T) device and can simulate a 96-cell network at real-time speed, yielding a speedup of x700 compared to the original Matlab code and x12.5 compared to the reference C implementation running on an Intel Xeon 2.66GHz machine with 20GB RAM. For a 1,056-cell network (non-real-time), an FPGA speedup of x45 against the C code can be achieved, demonstrating the design's usefulness in accelerating neuroscience research. Limited by the available on-chip memory, the FPGA can maximally support a 14,400-cell network (non-real-time) with online parameter configurability for cell state and network size. The maximum throughput of the FPGA ION-network accelerator can reach 2.13 GFLOPS.
Microprocessors and Microsystems | 2013
Ioannis Sourdis; Christos Strydis; Antonino Armato; Christos-Savvas Bouganis; Babak Falsafi; Georgi Gaydadjiev; Sebastian Isaza; Alirad Malek; R. Mariani; Dionisios N. Pnevmatikatos; Dhiraj K. Pradhan; Gerard K. Rauwerda; Robert M. Seepers; Rishad Ahmed Shafik; Kim Sunesen; Dimitris Theodoropoulos; Stavros Tzilis; Michalis Vavouras
The DeSyRe project builds on-demand adaptive and reliable Systems-on-Chips (SoCs). As fabrication technology scales down, chips are becoming less reliable, thereby incurring increased power and performance costs for fault tolerance. To make matters worse, power density is becoming a significant limiting factor in SoC design in general. In the face of such changes in the technological landscape, current solutions for fault tolerance are expected to introduce excessive overheads in future systems. Moreover, attempting to design and manufacture a totally defect-/fault-free system would heavily, even prohibitively, impact the design, manufacturing, and testing costs, as well as the system performance and power consumption. In this context, DeSyRe delivers a new generation of systems that are reliable by design at well-balanced power, performance, and design costs. In our attempt to reduce the overheads of fault tolerance, only a small fraction of the chip is built to be fault-free. This fault-free part is then employed to manage the remaining fault-prone resources of the SoC. The DeSyRe framework is applied to two medical systems with high safety requirements (measured using the IEC 61508 functional safety standard) and tight power and performance constraints.
complex, intelligent and software intensive systems | 2010
Sebastian Isaza; Friman Sánchez; Georgi Gaydadjiev; Alex Ramirez; Mateo Valero
Sequence alignment is a fundamental instrument in Bioinformatics. In recent years, numerous proposals have addressed the problem of accelerating this class of applications. This is due to the rapid growth of sequence databases in combination with the high computational demands imposed by the algorithms. In this paper we focus on the analysis of the progressive alignment in ClustalW, a widely used program for performing multiple sequence alignment. We have parallelized ClustalW for the Cell processor architecture and have carefully analyzed the scalability of its different phases with both the number of cores used and the input size. Experimental results show that computing profile scores scales well up to 16 SPE cores. As the input size increases, profile initialization in the PPE core becomes the predominant bottleneck.
international conference / workshop on embedded computer systems: architectures, modeling and simulation | 2008
Sebastian Isaza; Friman Sánchez; Georgi Gaydadjiev; Alex Ramirez; Mateo Valero
The fast growth of the bioinformatics field has attracted the attention of computer scientists in the last few years. At the same time, the increasing database sizes require greater efforts to improve the computational performance. From a computer architecture point of view, we intend to investigate how bioinformatics applications can benefit from future multi-core processors. In this paper we present a preliminary study of the Cell BE processor limitations when executing two representative sequence alignment applications (Ssearch and ClustalW). The inherent large parallelism of the targeted algorithms makes them ideal for architectures supporting multiple dimensions of parallelism (TLP and DLP). However, in the case of Cell BE we identified several architectural limitations that need a careful study and quantification.
computing frontiers | 2011
Sebastian Isaza; Friman Sánchez; Felipe Cabarcas; Alex Ramirez; Georgi Gaydadjiev
Sequence alignment is one of the fundamental tasks in bioinformatics. Due to the exponential growth of biological data and the computational complexity of the algorithms used, high performance computing systems are required. Although multicore architectures have the potential of exploiting the task-level parallelism found in these workloads, efficiently harnessing systems with hundreds of cores requires a deep understanding of the applications and the architecture. When incorporating large numbers of cores, performance scalability will likely saturate shared hardware resources like buses and memories. In this paper we evaluate the performance impact of various configurations of an accelerator-based multicore architecture with the aim of revealing and quantifying the bottlenecks. Then, we compare against a multicore using general-purpose processors and discuss the performance gap. Our target application is ClustalW, one of the most popular programs for Multiple Sequence Alignment. Different input data sets are characterized and we show how they influence performance. Simulation results show that due to the high computation-to-communication ratio and the transfer of data in large chunks, memory latency is well tolerated. However, bandwidth is critical to achieving maximum performance. Using a 32KB cache configuration with 4 banks can capture most of the memory traffic and therefore avoid expensive off-chip transactions. On the other hand, using a hardware queue for task synchronization allows us to handle a large number of cores. Finally, we show that using a simple load balancing strategy, we can increase performance of general-purpose cores by 28%.
digital systems design | 2011
Sebastian Isaza; Ernst Joachim Houtgast; Georgi Gaydadjiev
Exponential growth in biological sequence data combined with the computationally intensive nature of bioinformatics applications results in a continuously rising demand for processing power. In this paper, we propose a performance model that captures the behavior and performance scalability of HMMER, a bioinformatics application that identifies similarities between protein sequences and a protein family model. With our analytical model, the optimal master-worker ratio for a user scenario can be estimated. The model is evaluated and found to be accurate, with less than 2% error. We applied our model to a widely used heterogeneous multicore, the Cell BE, using the PPE and SPEs as master and workers respectively. Experimental results show that for the current parallelization strategy, the I/O speed at which the database is read from disk and the input pre-processing are the two most limiting factors in the Cell BE case.
applied reconfigurable computing | 2014
Ioannis Sourdis; Christos Strydis; Antonino Armato; Christos-Savvas Bouganis; Babak Falsafi; Georgi Gaydadjiev; Sebastian Isaza; Alirad Malek; R. Mariani; Samuel Pagliarini; Dionisios N. Pnevmatikatos; Dhiraj K. Pradhan; Gerard K. Rauwerda; Robert M. Seepers; Rishad Ahmed Shafik; Georgios Smaragdos; Dimitris Theodoropoulos; Stavros Tzilis; Michalis Vavouras
The DeSyRe project builds on-demand adaptive, reliable Systems-on-Chips. In response to the current semiconductor technology trends that make chips less reliable, DeSyRe describes a new generation of reliable-by-design systems, at a reduced power and performance cost. This is achieved through the following main contributions. DeSyRe defines a fault-tolerant system architecture built out of unreliable components, rather than aiming at totally fault-free and hence more costly chips. In addition, DeSyRe systems are on-demand adaptive to various types and densities of faults, as well as to other system constraints and application requirements. For leveraging on-demand adaptation/customization and reliability at reduced cost, a new dynamically reconfigurable substrate is designed and combined with runtime system software support. The above define a generic and repeatable design framework, which is applied to two medical SoCs with high reliability constraints and diverse performance and power requirements. One of the main goals of the DeSyRe project is to increase the availability of SoC components in the presence of permanent faults, caused at manufacturing time or due to device aging. A mix of coarse- and fine-grain reconfigurable hardware substrate is designed to isolate and bypass faulty component parts. The flexibility provided by the DeSyRe reconfigurable substrate is exploited at runtime by system optimization heuristics, which decide to modify component configuration when a permanent fault is detected, providing graceful degradation.
The Journal of Supercomputing | 2016
Aníbal Guerra; Jaime Lotero; Sebastian Isaza
complex, intelligent and software intensive systems | 2011
Sebastian Isaza; Ernst Joachim Houtgast; Friman Sánchez; Alex Ramirez; Georgi Gaydadjiev