S. Murali
Stanford University
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by S. Murali.
design, automation, and test in europe | 2004
S. Murali; G. De Micheli
We address the design of complex monolithic systems, where processing cores generate and consume a varying and large amount of data, thus bringing the communication links to the edge of congestion. Typical applications are in the area of multi-media processing. We consider a mesh-based networks on chip (NoC) architecture, and we explore the assignment of cores to mesh cross-points so that the traffic on links satisfies bandwidth constraints. A single-path deterministic routing between the cores places high bandwidth demands on the links. The bandwidth requirements can be significantly reduced by splitting the traffic between the cores across multiple paths. In this paper, we present NMAP, a fast algorithm that maps the cores onto a mesh NoC architecture under bandwidth constraints, minimizing the average communication delay. The NMAP algorithm is presented for both single minimum-path routing and split-traffic routing. The algorithm is applied to a benchmark DSP design and the resulting NoC is built and simulated at cycle accurate level in SystemC using macros from the /spl times/pipes library. Also, experiments with six video processing applications show significant savings in bandwidth and communication cost for NMAP algorithm when compared to existing algorithms.
design automation conference | 2004
S. Murali; G. De Micheli
Increasing communication demands of processor and memory cores in Systems on Chips (SoCs) necessitate the use of Networks on Chip (NoC) to interconnect the cores. An important phase in the design of NoCs is he mapping of cores onto the most suitable opology for a given application. In this paper, we present SUNMAP a tool for automatically selecting he best topology for a given application and producing a mapping of cores onto that topology. SUNMAP explores various design objectives such as minimizing average communication delay, area, power dissipation subject to bandwidth and area constraints. The tool supports different routing functions (dimension ordered, minimum-path, traffic splitting) and uses floorplanning information early in the topology selection process to provide feasible mappings. The network components of the chosen NoC are automatically generated using cycle-accurate SystemC soft macros from X-pipes architecture. SUNMAP automates NoC selection and generation, bridging an important design gap in building NoCs. Several experimental case studies are presented in the paper, which show the rich design space exploration capabilities of SUNMAP.
IEEE Design & Test of Computers | 2005
S. Murali; Theo Theocharides; Narayanan Vijaykrishnan; Mary Jane Irwin; Luca Benini; G. De Micheli
In this article, we discuss design constraints to characterize efficient error recovery mechanisms for the NoC design environment. We explore error control mechanisms at the data link and network layers and present the schemes architectural details. We investigate the energy efficiency, error protection efficiency, and performance impact of various error recovery mechanisms.
design, automation, and test in europe | 2004
Antoine Jalabert; S. Murali; Luca Benini; Giovanni De Micheli
Future Systems on Chips (SoCs) will integrate a large number of processor and storage cores onto a single chip and require Networks on Chip (NoC) to support the heavy communication demands of the system. The individual components of the SoCs will be heterogeneous in nature with widely varying functionality and communication requirements. The communication infrastructure should optimally match communication patterns among these components accounting for the individual component needs. In this paper we present xpipes Compiler, a tool for automatically instantiating an application-specific NoC for heterogeneous Multi-Processor SoCs. The xpipes Compiler instantiates a network of building blocks from a library of composable soft macros (switches, network interfaces and links) described in SystemC at the cycle-accurate level. The network components are optimized for that particular network and support reliable, latency-insensitive operation. Example systems with application-specific NoCs built using the xpipes Compiler show large savings in area (factor of 6.5), power (factor of 2.4) and latency (factor of 1.42) when compared to a general-purpose mesh-based NoC architecture.
international conference on computer aided design | 2006
S. Murali; Paolo Meloni; Federico Angiolini; David Atienza; Salvatore Carta; Luca Benini; G. De Micheli; Luigi Raffo
With increasing communication demands of processor and memory cores in systems on chips (SoCs), scalable networks on chips (NoCs) are needed to interconnect the cores. For the use of NoCs to be feasible in todays industrial designs, a custom-tailored, application-specific NoC that satisfies the design objectives and constraints of the targeted application domain is required. In this work, we present a design methodology that automates the synthesis of such application-specific NoC architectures. We present a floorplan aware design method that considers the wiring complexity of the NoC during the topology synthesis process. This leads to detecting timing violations on the NoC links early in the design cycle and to have accurate power estimations of the interconnect. We incorporate mechanisms to prevent deadlocks during routing, which is critical for proper operation of NoCs. We integrate the NoC synthesis method with an existing design flow, automating NoC synthesis, generation, simulation and physical design processes. We also present ways to ensure design convergence across the levels. Experiments on several SoC benchmarks are presented, which show that the synthesized topologies provide a large reduction in network power consumption (2.78 times on average) and improvement in performance (1.59 times on average) over the best mesh and mesh-based custom topologies. An actual layout of a multimedia SoC with the NoC designed using our methodology is presented, which shows that the designed NoC supports the required frequency of operation (close to 900 MHz) without any timing violations. We could design the NoC from input specifications to layout in 4 hours, a process that usually takes several weeks
design, automation, and test in europe | 2006
S. Murali; Martijn Coenen; Andrei Radulescu; Kgw Kees Goossens; G. De Micheli
A communication-centric design approach, networks on chips (NoCs), has emerged as the design paradigm for designing a scalable communication infrastructure for future systems on chips (SoCs). As technology advances, the number of applications or use-cases integrated on a single chip increases rapidly. The different use-cases of the SoC have different communication requirements (such as different bandwidth, latency constraints) and traffic patterns. The underlying NoC architecture has to satisfy the constraints of all the use-cases. In this work, we present a methodology to map multiple use-cases onto the NoC architecture, satisfying the constraints of each use-case. We present dynamic re-configuration mechanisms that match the NoC configuration to the communication characteristics of each use-case, also accounting for use-cases that can run in parallel. The methodology is applied to several real and synthetic SoC benchmarks, which result in a large reduction in NoC area (an average of 80%) and power consumption (an average of 54%) compared to traditional design approaches
asia and south pacific design automation conference | 2005
S. Murali; Luca Benini; G. De Micheli
Networks on chips (NoCs) have evolved as the communication design paradigm of future systems on chips (SoCs). In this work we target the NoC design of complex SoCs with heterogeneous processor/memory cores, providing quality-of-service (QoS) for the application. We present an integrated approach to mapping of cores onto NoC topologies and physical planning of NoCs, where the position and size of the cores and network components are computed. Our design methodology automates NoC mapping, physical planning, topology selection, topology optimization and instantiation, bridging an important design gap in building application specific NoCs. We also present a methodology to guarantee QoS for the application during the mapping-physical planning process by satisfying the delay/jitter constraints and real-time constraints of the traffic streams. Experimental studies show large area savings (up to 2/spl times/), bandwidth savings (up to 5/spl times/) and network component savings (up to 2.2/spl times/ in buffer count, 3.8/spl times/ in number of wires, 1.6/spl times/ in switch ports) compared to traditional design approaches.
asia and south pacific design automation conference | 2006
S. Murali; Martijn Coenen; Andrei Radulescu; Kgw Kees Goossens; G. De Micheli
To provide a scalable communication infrastructure for systems on chips (SoCs), networks on chips (NoCs), a communication centric design paradigm is needed. To be cost effective, SoCs are often programmable and integrate several different applications or use-cases on to the same chip. For the SoC platform to support the different use-cases, the NoC architecture should satisfy the performance constraints of each individual use-case. In this work we motivate the need to consider multiple use-cases during the NoC design process. We present a method to efficiently map the applications on to the NoC architecture, satisfying the design constraints of each individual use-case. We also present novel ways to dynamically reconfigure the network across the different use-cases and explore the possibility of integrating dynamic voltage and frequency scaling (DVS/DFS) techniques with the use-case centric NoC design methodology. We validate the performance of the design methodology on several SoC applications. The dynamic reconfiguration of the NoC integrated with DVS/DFS schemes results in large power savings for the resulting NoC systems
design, automation, and test in europe | 2005
S. Murali; G. De Micheli
As the communication requirements of current and future Multiprocessor Systems on Chips (MPSoCs) continue to increase, scalable communication architectures are needed to support the heavy communication demands of the system. This is reflected in the recent trend that many of the standard bus products such as STbus, have now introduced the capability of designing a crossbar with multiple buses operating in parallel. The crossbar configuration should be designed to closely match the application traffic characteristics and performance requirements. In this work we address this issue of application-specific design of optimal crossbar (using STbus crossbar architecture), satisfying the performance requirements of the application and optimal binding of cores onto the crossbar resources. We present a simulation based design approach that is based on analysis of actual traffic trace of the application, considering local variations in traffic rates, temporal overlap among traffic streams and criticality of traffic streams. Our methodology is applied to several MPSoC designs and the resulting crossbar platforms are validated for performance by cycle-accurate SystemC simulation of the designs. The experimental case studies show large reduction in packet latencies (up to 7×) and large crossbar component savings (up to 3.5×) compared to traditional design approaches.
international conference on hardware/software codesign and system synthesis | 2006
Martijn Coenen; S. Murali; Andrei Ruadulescu; Kees Goossens; Giovanni De Micheli
When designing a system-on-chip (SoC) using a network- on-chip (NoC), silicon area and power consumption are two key elements to optimize. A dominant part of the NoC area and power consumption is due to the buffers in the network interfaces (NIs) needed to decouple computation from communication. Having such a decoupling prevents stalling of IP blocks due to the communication interconnect. The size of these buffers is especially important in real-time systems, as there they should be big enough to obtain predictable performance. To ensure that buffers do not overflow, end- to-end flow-control is needed. One form of end-to-end flow- control used in NoCs is credit-based flow-control. This form places additional requirements on the buffer sizes, because the flow-control delays need to be taken into account. In this work, we present an algorithm to find the minimal decoupling buffer sizes for a NoC using TDMA and credit- based end-to-end flow-control, subject to the performance constraints of the applications running on the SoC. Our experiments show that our method results in a 84% reduction of the total NoC buffer area when compared to the state-of- the art buffer-sizing methods. Moreover, our method has a low run-time complexity, producing results in the order of minutes for our experiments, enabling quick design cycles for large SoC designs. Finally, our method can take into account multiple usecases running on the same SoC.