
Román Hermida

Complutense University of Madrid

Publication

Featured research published by Román Hermida.

Design, Automation, and Test in Europe | 2005

A Complete Network-On-Chip Emulation Framework

N. Genko; David Atienza; G. De Micheli; José M. Mendías; Román Hermida; Francky Catthoor

Current systems-on-chip (SoC) execute applications that demand extensive parallel processing. Networks-on-chip (NoC) provide a structured way of realizing interconnections on silicon, and obviate the limitations of bus-based solutions. NoCs can have regular or ad hoc topologies, and functional validation is essential to assess their correctness and performance. In this paper, we present a flexible emulation environment implemented on an FPGA that is suitable to explore, evaluate and compare a wide range of NoC solutions with very limited effort. Our experimental results show a speed-up of four orders of magnitude with respect to cycle-accurate HDL simulation, while retaining cycle accuracy. With our emulation framework, designers can explore and optimize a wide range of solutions, as well as quickly characterize performance figures.

IEEE Transactions on Very Large Scale Integration Systems | 2001

A framework for reconfigurable computing: task scheduling and context management

Rafael Maestre; Fadi J. Kurdahi; Milagros Fernández; Román Hermida; Nader Bagherzadeh; Hartej Singh

Dynamically reconfigurable architectures are emerging as a viable design alternative to implement a wide range of computationally intensive applications. At the same time, an urgent necessity has arisen for support tool development to automate the design process and achieve optimal exploitation of the architectural features of the system. Task scheduling and context (configuration) management become very critical issues in achieving the high performance that digital signal processing (DSP) and multimedia applications demand. This article proposes a strategy to automate the design process which considers all possible optimizations that can be carried out at compilation time, regarding context and data transfers. This strategy is general in nature and could be applied to different reconfigurable systems. We also discuss the key aspects of the scheduling problem in a reconfigurable architecture such as MorphoSys. In particular, we focus on a task scheduling methodology for DSP and multimedia applications, as well as the context management and scheduling optimizations.

Design, Automation, and Test in Europe | 1999

Kernel scheduling in reconfigurable computing

Rafael Maestre; Fadi J. Kurdahi; Nader Bagherzadeh; Hartej Singh; Román Hermida; Milagros Fernández

Reconfigurable computing is a flexible way of addressing a wide range of applications with a single device while maintaining a good level of performance. This area of computing involves different issues and concepts when compared with conventional computing systems. One of these concepts is context loading. The context refers to the coded configuration information needed to implement a particular circuit behaviour. An important problem for reconfigurable computing is the scheduling of a group of kernels (sub-tasks) that constitute a complex application for minimum execution time. In this paper, we show how the different execution orders for these sub-tasks may result in varying levels of performance. We formulate an analytical approach and present a solution for this new problem.

ACM Transactions on Design Automation of Electronic Systems | 2004

Annealing placement by thermodynamic combinatorial optimization

Juan de Vicente; Juan Lanchares; Román Hermida

Placement is a key issue in integrated circuit physical design. Some techniques inspired by thermodynamics, such as Simulated Annealing, address this problem. In this article, we present a combinatorial optimization method directly derived from both Thermodynamics and Information Theory. In TCO (Thermodynamic Combinatorial Optimization), two kinds of processes are considered: microstate and macrostate transformations. Applying Shannon's definition of entropy to reversible microstate transformations, a probability of acceptance based on Fermi-Dirac statistics is derived. On the other hand, applying thermodynamic laws to macrostate transformations, an efficient annealing schedule is provided. TCO has been compared with a custom Simulated Annealing (SA) tool on a set of benchmark circuits for the FPGA (Field Programmable Gate Array) placement problem. TCO has provided the high-quality results of SA, while inheriting the adaptive properties of Natural Optimization (NO).
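The abstract does not state the acceptance formula itself. As a minimal sketch, assuming the usual logistic form of the Fermi-Dirac distribution, the acceptance probability of a move with cost change `delta_cost` at temperature `temperature` could look like the following. The function names and the overflow guard are illustrative assumptions, not taken from the paper.

```python
import math
import random


def fermi_dirac_acceptance(delta_cost: float, temperature: float) -> float:
    """Acceptance probability 1 / (1 + exp(delta_cost / temperature)).

    Assumed logistic (Fermi-Dirac-shaped) rule; the paper's exact
    expression may differ.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    x = delta_cost / temperature
    # exp() overflows for large positive arguments; such moves are
    # effectively never accepted.
    if x > 700:
        return 0.0
    return 1.0 / (1.0 + math.exp(x))


def accept_move(delta_cost: float, temperature: float, rng=random) -> bool:
    """Draw a random number and decide whether to accept the move."""
    return rng.random() < fermi_dirac_acceptance(delta_cost, temperature)
```

Unlike the Metropolis rule used in classic simulated annealing, this form accepts a cost-neutral move with probability 0.5 rather than 1, and improving moves are accepted with a probability that approaches 1 only as the improvement grows relative to the temperature.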

ACM Transactions on Design Automation of Electronic Systems | 2007

HW-SW emulation framework for temperature-aware design in MPSoCs

David Atienza; Pablo García Del Valle; Giacomo Paci; Francesco Poletti; Luca Benini; Giovanni De Micheli; José M. Mendías; Román Hermida

New tendencies envisage multiprocessor systems-on-chips (MPSoCs) as a promising solution for the consumer electronics market. MPSoCs are complex to design, as they must execute multiple applications (games, video) while meeting additional design constraints (energy consumption, time-to-market). Moreover, the rise of temperature in the die for MPSoCs can seriously affect their final performance and reliability. In this article, we present a new hardware-software emulation framework that allows designers a complete exploration of the thermal behavior of final MPSoC designs early in the design flow. The proposed framework uses FPGA emulation as the key element to model hardware components of the considered MPSoC platform at multimegahertz speeds. It automatically extracts detailed system statistics that are used as input to our software thermal library running in a host computer. This library calculates at runtime the temperature of on-chip components, based on the collected statistics from the emulated system and final floorplan of the MPSoC. This enables fast testing of various thermal management techniques. Our results show speedups of three orders of magnitude compared to cycle-accurate MPSoC simulators.

Proceedings. 24th EUROMICRO Conference (Cat. No.98EX204) | 1998

RSR: a new rectilinear Steiner minimum tree approximation for FPGA placement and global routing

J. de Vicente; Juan Lanchares; Román Hermida

The work combines the FPGA placement and global routing phases into a single phase, taking advantage of the interrelations between them. The authors have developed rectilinear Steiner regions (RSR), a new fast algorithm to approximate the rectilinear Steiner minimum tree (RSMT) of each multi-terminal net. The search for placement solutions is performed in three simulated annealing optimization phases, guided by different objective functions. The first uses the classic semi-perimeter metric to reduce the length of the nets. The second estimates the length of the nets more precisely with the RSR algorithm. The third measures congestion by performing a fast routing of RSR regions in each placement iteration. They have also developed an RSR-based global router. This optimization method has been applied to the placement and global routing of a set of benchmark circuits. The layouts obtained require equal or fewer routing tracks per channel segment than those produced by other tools in the literature that optimize only the classic semi-perimeter placement cost function.
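The classic semi-perimeter metric used in the first phase is half the perimeter of the bounding box around a net's terminals. A minimal sketch follows, assuming each net is given as a list of `(x, y)` pin coordinates. The function names are illustrative, and the RSR algorithm itself is not reproduced here.

```python
def semi_perimeter(pins):
    """Half-perimeter of the bounding box enclosing all pins of one net."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def placement_cost(nets):
    """Sum of semi-perimeter estimates over all nets of a placement."""
    return sum(semi_perimeter(pins) for pins in nets)
```

For two- and three-terminal nets the semi-perimeter equals the rectilinear Steiner minimum tree length; for nets with more terminals it is only a lower bound, which is why a more precise estimate is useful in the later phases.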

International Symposium on Circuits and Systems | 2005

A novel approach for network on chip emulation

N. Genko; David Atienza; G. De Micheli; Luca Benini; José M. Mendías; Román Hermida; F. Catthoor

Current systems-on-chip execute applications that demand extensive parallel processing. Networks-on-chip (NoC) provide a structured way of realizing interconnections on silicon, and obviate the limitations of bus-based solutions. NoCs can have regular or ad hoc topologies, and functional validation is essential to assess their correctness and performance. In this paper, we present a flexible emulation environment implemented on an FPGA that is suitable to explore, evaluate and compare a wide range of NoC solutions with very limited effort. Our experimental results show a speed-up of four orders of magnitude with respect to cycle-accurate HDL simulation, while retaining cycle accuracy. With our emulation framework, designers can explore and optimize a range of solutions, as well as quickly characterize performance figures.

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2006

Bitwise scheduling to balance the computational cost of behavioral specifications

María Molina; Rafael Ruiz-Sautua; José M. Mendías; Román Hermida

Conventional scheduling algorithms try to balance the number of operations of every different type executed per cycle. However, in most cases, a uniform distribution is not reachable, and thus, some hardware (HW) waste appears. This situation becomes worse when heterogeneous specifications (those formed by operations with different data formats and widths) are synthesized. Our proposal is an innovative bit-level algorithm able to minimize this HW waste. In order to obtain uniform distributions of the computational cost of operations among cycles, it successively transforms specification operations into sets of smaller ones, which are then scheduled independently. As a consequence, some specification operations may be executed during a set of nonconsecutive cycles, and over several functional units. In combination with allocation algorithms able to guarantee the bit-level reuse of HW resources, our approach produces circuits with substantially smaller area than conventional implementations. Due to the fragmentation of operations, in the proposed implementations, the type, number, and width of HW resources are, in general, independent of the type, number, and width of both specification operations and variables. Additionally, the clock-cycle length is also reduced in most circuits.
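The idea of transforming one operation into smaller ones can be illustrated with a toy software model: a wide addition is split into narrower fragments that chain through a carry, so each fragment could in principle be scheduled in a different cycle or on a different adder. The function name, the fragment widths, and the returned structure are assumptions made for illustration and do not reproduce the paper's algorithm.

```python
def fragmented_add(a: int, b: int, width: int, fragment_width: int):
    """Add two width-bit values as a chain of narrower additions.

    Returns (result, carry_out, fragments), where fragments lists the
    (lowest_bit, bit_count) slice handled by each smaller addition.
    """
    result = 0
    carry = 0
    fragments = []
    for lo in range(0, width, fragment_width):
        bits = min(fragment_width, width - lo)  # last fragment may be narrower
        mask = (1 << bits) - 1
        partial = ((a >> lo) & mask) + ((b >> lo) & mask) + carry
        carry = partial >> bits                 # carry into the next fragment
        result |= (partial & mask) << lo
        fragments.append((lo, bits))
    return result, carry, fragments
```

Splitting a 16-bit addition into 6-bit fragments yields slices of 6, 6, and 4 bits, which shows how fragment widths can be independent of the widths in the original specification.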

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2011

A Distributed Controller for Managing Speculative Functional Units in High Level Synthesis

A. A. Del Barrio; Seda Ogrenci Memik; María Molina; José M. Mendías; Román Hermida

Speculative functional units (SFUs) are arithmetic functional units that operate using a predictor for the carry signal. The carry prediction helps to shorten the critical path of the functional unit. The average case performance of these units is determined by the hit rate of the prediction. In case of mispredictions, the SFUs need to be coordinated by the datapath control mechanism to perform corrections and to maintain the datapath in the correct state. Devising a control mechanism for correcting mispredictions without adversely impacting overall performance is the most important challenge. In this paper, we present techniques for designing a datapath controller for seamless deployment of SFUs in high level synthesis. We have developed two techniques based on two main control paradigms: centralized and distributed control. The centralized approach stops the execution of the entire datapath for each misprediction and resumes execution once the correct value of the carry is known. The distributed approach decouples the functional unit suffering from the misprediction from the rest of the datapath. Hence, it allows the remainder of the functional units to carry on execution and be at different scheduling states at different times. We tested datapaths utilizing both linear structures and logarithmic structures for speculative arithmetic functional units. Our results show that it is possible to reduce execution time by as much as 38% (33% on average) for linear structures and by as much as 37.2% (25% on average) for logarithmic structures.
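The prediction and correction behaviour described above can be sketched with a toy software model of a speculative adder. It assumes the operands are split into a low and a high part, that the high part starts with a predicted carry, and that a misprediction costs one extra cycle. The split point, the default prediction of 0, and the cycle counts are illustrative assumptions, not the design evaluated in the paper.

```python
def speculative_add(a: int, b: int, width: int = 16, predicted_carry: int = 0):
    """Add two width-bit values using a predicted carry into the high part.

    Returns (result, cycles): 1 cycle on a correct prediction, 2 cycles
    when the high part must be recomputed with the actual carry.
    """
    full_mask = (1 << width) - 1
    a &= full_mask
    b &= full_mask
    half = width // 2
    low_mask = (1 << half) - 1

    # Low and high parts are computed in parallel in this model.
    low = (a & low_mask) + (b & low_mask)
    actual_carry = low >> half
    high = (a >> half) + (b >> half) + predicted_carry
    cycles = 1

    if predicted_carry != actual_carry:
        # Misprediction: redo the high part with the actual carry.
        high = (a >> half) + (b >> half) + actual_carry
        cycles += 1

    result = ((high << half) | (low & low_mask)) & full_mask
    return result, cycles
```

In this model the average latency depends directly on how often the predicted carry matches the actual one, which mirrors the abstract's point that average-case performance is determined by the hit rate of the prediction.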

International Symposium on Systems Synthesis | 1999

A framework for scheduling and context allocation in reconfigurable computing

Rafael Maestre; Milagros Fernández; Román Hermida; Nader Bagherzadeh

Reconfigurable computing is emerging as a viable design alternative to implement a wide range of computationally intensive applications. The scheduling problem becomes a critical issue in achieving the high performance that these kinds of applications demand. The paper describes the different aspects of the scheduling problem in a reconfigurable architecture. We also propose a general strategy for performing, at compilation time, a scheduling that includes all possible optimizations regarding context (configuration) and data transfers. In particular, we focus on the methodology and mechanisms to solve the context scheduling problem. Some experimental results are presented to validate our assumptions. Finally, the problem of data transfers is formulated, to be addressed in future work.
