Publication


Featured research published by Radu Marculescu.


Asia and South Pacific Design Automation Conference | 2003

Energy-aware mapping for tile-based NoC architectures under performance constraints

Jingcao Hu; Radu Marculescu

In this paper, we present an algorithm which automatically maps the IPs/cores onto a generic regular Network on Chip (NoC) architecture such that the total communication energy is minimized. At the same time, the performance of the mapped system is guaranteed to satisfy the specified constraints through bandwidth reservation. As the main contribution, we first formulate the problem of energy-aware mapping, in a topological sense, and then propose an efficient branch-and-bound algorithm to solve it. Experimental results show that the proposed algorithm is very fast and robust, and significant energy savings can be achieved. For instance, for a complex video/audio SoC design, on average, 60.4% energy savings have been observed compared to an ad-hoc implementation.
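The mapping problem can be illustrated with a minimal sketch. It assumes that communication energy is proportional to traffic volume times Manhattan hop count on a 2D mesh, which is a simplification and not the energy model of the paper. The function name `map_cores`, the traffic dictionary format, and the bound (the cost of the cores placed so far) are illustrative choices, not the authors' algorithm.

```python
def manhattan(a, b):
    """Hop count between two tiles on a 2D mesh."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def map_cores(n_cores, traffic, rows, cols):
    """Place cores 0..n_cores-1 on distinct tiles of a rows x cols mesh.

    traffic: dict {(src_core, dst_core): volume}, volumes >= 0 (assumed format).
    Returns (best_cost, mapping) with cost = sum(volume * hops).
    """
    tiles = [(r, c) for r in range(rows) for c in range(cols)]
    best = [float("inf"), None]

    def partial_cost(placed):
        # Cost of flows whose endpoints are both placed. Because every term
        # is non-negative, this never exceeds the cost of any completion,
        # so it is a valid lower bound.
        return sum(v * manhattan(placed[s], placed[d])
                   for (s, d), v in traffic.items()
                   if s in placed and d in placed)

    def branch(core, placed, used):
        cost = partial_cost(placed)
        if cost >= best[0]:
            return  # bound: this subtree cannot beat the best mapping found
        if core == n_cores:
            best[0], best[1] = cost, dict(placed)
            return
        for tile in tiles:
            if tile not in used:
                placed[core] = tile
                used.add(tile)
                branch(core + 1, placed, used)
                used.remove(tile)
                del placed[core]

    branch(0, {}, set())
    return best[0], best[1]


if __name__ == "__main__":
    demo_traffic = {(0, 1): 10, (1, 2): 5, (2, 3): 10, (0, 3): 1}
    print(map_cores(4, demo_traffic, 2, 2))
```

A tighter lower bound would prune more of the search tree; the one used here is only the simplest bound that remains correct.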


Design, Automation, and Test in Europe | 2003

Exploiting the Routing Flexibility for Energy/Performance Aware Mapping of Regular NoC Architectures

Jingcao Hu; Radu Marculescu

In this paper, we present an algorithm which automatically maps the IPs onto a generic regular Network on Chip (NoC) architecture and constructs a deadlock-free deterministic routing function such that the total communication energy is minimized. At the same time, the performance of the resulting communication system is guaranteed to satisfy the specified constraints through bandwidth reservation. As the main contribution, we first formulate the problem of energy/performance aware mapping, in a topological sense, and show how the routing flexibility can be exploited to expand the solution space and improve the solution quality. An efficient branch-and-bound algorithm is then described to solve this problem. Experimental results show that the proposed algorithm is very fast, and significant energy savings can be achieved. For instance, for a complex video/audio application, 51.7% energy savings have been observed, on average, compared to an ad-hoc implementation.
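For readers unfamiliar with deterministic routing on a mesh, dimension-ordered (XY) routing is a standard example of a deadlock-free deterministic routing function. The sketch below shows XY routing only as a point of reference; it is not the routing function constructed in the paper, and the name `xy_route` and the `(x, y)` tile coordinates are assumptions made for illustration.

```python
def xy_route(src, dst):
    """Return the tiles visited from src to dst (inclusive) under XY routing.

    Tiles are (x, y) pairs. The packet first travels along x until the
    column matches, then along y. Always resolving x before y is the
    ordering restriction that makes this scheme deadlock-free on a mesh.
    """
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))
```

XY routing allows exactly one path per source-destination pair; the routing flexibility discussed in the abstract refers to choosing among more than one legal path.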


Design, Automation, and Test in Europe | 2004

Energy-aware communication and task scheduling for network-on-chip architectures under real-time constraints

Jingcao Hu; Radu Marculescu

In this paper, we present a novel energy-aware scheduling (EAS) algorithm which statically schedules both communication transactions and computation tasks onto heterogeneous network-on-chip (NoC) architectures under real-time constraints. Our algorithm automatically assigns tasks onto different processing elements and then schedules their execution. At the same time, the algorithm also takes into consideration the exact communication delay by scheduling communication transactions in parallel. As the main contribution, we first formulate the problem of concurrent communication and task scheduling for heterogeneous NoC architectures and then propose an efficient heuristic to solve it. Experimental results show that significant energy savings can be achieved by using our energy-aware scheduler while meeting the specified performance constraints. For instance, for a complex multimedia application, 44% energy savings have been observed, on average, compared to the schedules generated by a standard earliest-deadline-first scheduler.


ACM Transactions on Design Automation of Electronic Systems | 2007

On-chip communication architecture exploration: A quantitative evaluation of point-to-point, bus, and network-on-chip approaches

Hyung Gyu Lee; Naehyuck Chang; Umit Y. Ogras; Radu Marculescu

Traditionally, design-space exploration for systems-on-chip (SoCs) has focused on the computational aspects of the problem at hand. However, as the number of components on a single chip and their performance continue to increase, a shift from computation-based to communication-based design becomes mandatory. As a result, the communication architecture plays a major role in the area, performance, and energy consumption of the overall system. This article presents a comprehensive evaluation of three on-chip communication architectures targeting multimedia applications. Specifically, we compare and contrast the network-on-chip (NoC) with point-to-point (P2P) and bus-based communication architectures in terms of area, performance, and energy consumption. As the main contribution, we present complete P2P, bus-, and NoC-based implementations of a real multimedia application (i.e., the MPEG-2 encoder), and provide direct measurements using an FPGA prototype and actual video clips, rather than simulation and synthetic workloads. We also support the experimental findings through a theoretical analysis. Both experimental and analysis results show that the NoC architecture scales very well in terms of area, performance, energy, and design effort, while the P2P and bus-based architectures scale poorly on all accounts except for performance and area, respectively.


IEEE Transactions on Very Large Scale Integration Systems | 2004

On-chip traffic modeling and synthesis for MPEG-2 video applications

Girish Varatkar; Radu Marculescu

The objective of this paper is to introduce self-similarity as a fundamental property exhibited by the bursty traffic between on-chip modules in typical MPEG-2 video applications. Statistical tests performed on relevant traces extracted from common video clips establish unequivocally the existence of self-similarity in video traffic. Using a generic tile-based communication architecture, we discuss the implications of our findings on on-chip buffer space allocation and present quantitative evaluations for typical video streams. We also describe a technique for synthetically generating traces having statistical properties similar to those obtained from real video clips. Our proposed technique speeds up buffer simulations and allows media system designers to explore architectures rapidly and use large media data benchmarks more efficiently. We believe that our findings open new directions of research with deep implications on some fundamental issues in on-chip network design for multimedia applications.
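Self-similarity is usually quantified by the Hurst parameter H, where H near 0.5 indicates no long-range dependence and H closer to 1 indicates strongly self-similar traffic. The sketch below uses the aggregated-variance estimator, one common textbook method; the abstract does not say which statistical tests the authors applied, so this is an assumption made for illustration. The function name and block sizes are hypothetical.

```python
import math
import random


def hurst_aggregated_variance(series, block_sizes):
    """Estimate the Hurst parameter with the aggregated-variance method.

    For each block size m, the series is split into blocks of length m and
    the variance of the block means is computed. For a self-similar process
    that variance scales as m**(2H - 2), so H = 1 + slope / 2, where slope
    is the least-squares slope of log(variance) against log(m).
    """
    log_m, log_var = [], []
    for m in block_sizes:
        n_blocks = len(series) // m
        means = [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]
        mu = sum(means) / n_blocks
        var = sum((x - mu) ** 2 for x in means) / (n_blocks - 1)
        log_m.append(math.log(m))
        log_var.append(math.log(var))
    k = len(log_m)
    mean_x = sum(log_m) / k
    mean_y = sum(log_var) / k
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(log_m, log_var))
             / sum((x - mean_x) ** 2 for x in log_m))
    return 1 + slope / 2


if __name__ == "__main__":
    rng = random.Random(0)
    # Independent Gaussian noise has no long-range dependence,
    # so the estimate should come out close to 0.5.
    noise = [rng.gauss(0, 1) for _ in range(4096)]
    print(hurst_aggregated_variance(noise, [1, 2, 4, 8, 16, 32, 64]))
```

Applied to a real traffic trace (for example, bytes transferred per time slot between two modules), an estimate well above 0.5 would be consistent with the bursty, self-similar behavior the abstract describes.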


IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2006

System-Level Buffer Allocation for Application-Specific Networks-on-Chip Router Design

Jingcao Hu; Umit Y. Ogras; Radu Marculescu

In this paper, a novel system-level buffer planning algorithm that can be used to customize the router design in networks-on-chip (NoCs) is presented. More precisely, given the traffic characteristics of the target application and the total budget of the available buffering space, the proposed algorithm automatically assigns the buffer depth for each input channel, in different routers across the chip, such that the overall performance is maximized. This is in deep contrast with the uniform assignment of buffering resources (currently used in NoC design), which can significantly degrade the overall system performance. Indeed, the experimental results show that while the proposed algorithm is very fast, significant performance improvements can be achieved compared to the uniform buffer allocation. For instance, for a complex audio/video application, about 80% savings in buffering resources can be achieved by smart buffer allocation using the proposed algorithm.
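The contrast between uniform and traffic-aware buffer sizing can be sketched with a simple greedy allocator. This is not the algorithm from the paper. It assumes each input channel behaves like an independent M/M/1/K queue (a strong simplification), and it hands out one buffer slot at a time to the channel with the highest load-weighted blocking probability. The names `blocking_probability` and `allocate_buffers` are hypothetical.

```python
def blocking_probability(rho, k):
    """Blocking probability of an M/M/1/K queue with utilization rho
    and total capacity k (assumed channel model)."""
    if abs(rho - 1.0) < 1e-12:
        return 1.0 / (k + 1)
    return (1 - rho) * rho ** k / (1 - rho ** (k + 1))


def allocate_buffers(loads, budget, min_depth=1):
    """Greedily distribute `budget` buffer slots over input channels.

    loads: dict {channel_name: utilization}, an assumed traffic summary.
    Every channel starts at min_depth; each remaining slot goes to the
    channel whose load-weighted blocking probability is currently largest.
    """
    depths = {ch: min_depth for ch in loads}
    remaining = budget - min_depth * len(loads)
    if remaining < 0:
        raise ValueError("budget too small for the minimum depth")
    for _ in range(remaining):
        worst = max(
            loads,
            key=lambda ch: loads[ch] * blocking_probability(loads[ch], depths[ch]),
        )
        depths[worst] += 1
    return depths


if __name__ == "__main__":
    print(allocate_buffers({"a": 0.9, "b": 0.1}, budget=8))
```

With one heavily loaded and one lightly loaded channel, the greedy rule concentrates the budget on the loaded channel, whereas a uniform split would give each channel four slots.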


International Conference on Computer Aided Design | 2004

Application-specific buffer space allocation for networks-on-chip router design

Jingcao Hu; Radu Marculescu

We present a system-level buffer planning algorithm that can be used to customize the router design in networks-on-chip (NoCs). More precisely, given the traffic characteristics of the target application and the buffering space budget, our algorithm automatically assigns the buffer depth for each input channel, in different routers across the chip, to match the communication pattern, such that the overall performance is maximized. This is in deep contrast with the uniform assignment of buffering resources (currently used in NoC design) which can significantly degrade the overall system performance. For instance, for a complex audio/video application, about 85% savings in buffering resources can be achieved by smart buffer allocation using our algorithm without any reduction in performance.


IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2008

Energy- and Performance-Aware Incremental Mapping for Networks on Chip With Multiple Voltage Levels

Chen Ling Chou; Umit Y. Ogras; Radu Marculescu

Achieving effective run-time mapping on multiprocessor systems-on-chip (MPSoCs) is a challenging task, particularly since the arrival order of the target applications is not known a priori. This paper targets real-time applications which are dynamically mapped onto embedded MPSoCs, where communication happens via the Network-on-Chip (NoC) approach, and resources connected to the NoC have multiple voltage levels. We address precisely the energy- and performance-aware incremental mapping problem for NoCs with multiple voltage levels and propose an efficient technique (consisting of region selection and node allocation) to solve it. Moreover, the proposed technique allows for new applications to be added to the system with minimal interprocessor communication overhead. Experimental results show that the proposed technique is very fast, and as much as 50% communication energy savings can be achieved compared to using an arbitrary allocation scheme.


Asia and South Pacific Design Automation Conference | 2003

Towards on-chip fault-tolerant communication

Tudor Dumitras; Sam Kerner; Radu Marculescu

As CMOS technology scales down into the deep-submicron (DSM) domain, devices and interconnects are subject to new types of malfunctions and failures that are harder to predict and avoid with the current system-on-chip (SoC) design methodologies. Relaxing the requirement of 100% correctness in operation drastically reduces the costs of design but, at the same time, requires SoCs be designed with some degree of system-level fault-tolerance. In this paper, we introduce a high-level model of DSM failure patterns and propose a new communication paradigm for SoCs, namely stochastic communication. Specifically, for a generic tile-based architecture, we propose a randomized algorithm which not only separates computation from communication, but also provides the required fault-tolerance to on-chip failures. This new technique is easy and cheap to implement in SoCs that integrate a large number of communicating IP cores.
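The flavor of a randomized, gossip-style broadcast on a tile grid can be shown with a short simulation. The abstract does not give the details of the authors' stochastic communication algorithm, so everything here is an assumption made for illustration: in each round, every informed tile forwards the message to each uninformed mesh neighbor with probability `forward_prob`, which loosely stands in for both the random forwarding decision and transient link failures.

```python
import random


def gossip_rounds(rows, cols, source, forward_prob, rng, max_rounds=1000):
    """Simulate a randomized broadcast on a rows x cols tile grid.

    Returns the number of rounds until every tile holds the message,
    or None if that does not happen within max_rounds.
    """
    informed = {source}
    for rnd in range(1, max_rounds + 1):
        newly_informed = set()
        for (r, c) in informed:
            for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                in_grid = 0 <= nb[0] < rows and 0 <= nb[1] < cols
                if in_grid and nb not in informed:
                    # A failed draw models a dropped or unsent message.
                    if rng.random() < forward_prob:
                        newly_informed.add(nb)
        informed |= newly_informed
        if len(informed) == rows * cols:
            return rnd
    return None


if __name__ == "__main__":
    rng = random.Random(1)
    print(gossip_rounds(4, 4, (0, 0), 0.5, rng))
```

Because a tile can receive the message along many different paths, losing individual transmissions delays the broadcast rather than preventing it, which is the intuition behind using randomization for fault tolerance.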


Design Automation Conference | 2007

Voltage-frequency island partitioning for GALS-based networks-on-chip

Umit Y. Ogras; Radu Marculescu; Puru Choudhary; Diana Marculescu

Due to high levels of integration and complexity, the design of multi-core SoCs has become increasingly challenging. In particular, energy consumption and distributing a single global clock signal throughout a chip have become major design bottlenecks. To deal with these issues, a globally asynchronous, locally synchronous (GALS) design is considered for achieving low power consumption and modular design. Such a design style fits nicely with the concept of voltage-frequency islands (VFIs) which has been recently introduced for achieving fine-grain system-level power management. This paper proposes a design methodology for partitioning an NoC architecture into multiple VFIs and assigning supply and threshold voltage levels to each VFI. Simulation results show about 40% savings for a real video application and demonstrate the effectiveness of our approach in reducing the overall system energy consumption. The results and functional correctness are validated using an FPGA prototype for an NoC with multiple VFIs.
