Stephen Neuendorffer | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Stephen Neuendorffer is active.

Explore More

Publication

Featured researches published by Stephen Neuendorffer.

Proceedings of the IEEE | 2003

Taming heterogeneity - the Ptolemy approach

Johan Eker; Jorn W. Janneck; Edward A. Lee; Jie Liu; Xiaojun Liu; Jozsef Ludvig; Stephen Neuendorffer; Sonia R. Sachs; Yuhong Xiong

Modern embedded computing systems tend to be heterogeneous in the sense of being composed of subsystems with very different characteristics, which communicate and interact in a variety of ways-synchronous or asynchronous, buffered or unbuffered, etc. Obviously, when designing such systems, a modeling language needs to reflect this heterogeneity. Todays modeling environments usually offer a variant of what we call amorphous heterogeneity to address this problem. This paper argues that modeling systems in this manner leads to unexpected and hard-to-analyze interactions between the communication mechanisms and proposes a more structured approach to heterogeneity, called hierarchical heterogeneity, to solve this problem. It proposes a model structure and semantic framework that support this form of heterogeneity, and discusses the issues arising from heterogeneous component interaction and the desire for component reuse. It introduces the notion of domain polymorphism as a way to address these issues.

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2011

High-Level Synthesis for FPGAs: From Prototyping to Deployment

Jason Cong; Bin Liu; Stephen Neuendorffer; Juanjo Noguera; Kees A. Vissers; Zhiru Zhang

Escalating system-on-chip design complexity is pushing the design community to raise the level of abstraction beyond register transfer level. Despite the unsuccessful adoptions of early generations of commercial high-level synthesis (HLS) systems, we believe that the tipping point for transitioning to HLS msystem-on-chip design complexityethodology is happening now, especially for field-programmable gate array (FPGA) designs. The latest generation of HLS tools has made significant progress in providing wide language coverage and robust compilation technology, platform-based modeling, advancement in core HLS algorithms, and a domain-specific approach. In this paper, we use AutoESLs AutoPilot HLS tool coupled with domain-specific system-level implementation platforms developed by Xilinx as an example to demonstrate the effectiveness of state-of-art C-to-FPGA synthesis solutions targeting multiple application domains. Complex industrial designs targeting Xilinx FPGAs are also presented as case studies, including comparison of HLS solutions versus optimized manual designs. In particular, the experiment on a sphere decoder shows that the HLS solution can achieve an 11-31% reduction in FPGA resource usage with improved design productivity compared to hand-coded design.

Journal of Circuits, Systems, and Computers | 2003

ACTOR-ORIENTED DESIGN OF EMBEDDED HARDWARE AND SOFTWARE SYSTEMS

Edward A. Lee; Stephen Neuendorffer; Michael J. Wirthlin

In this paper, we argue that model-based design and platform-based design are two views of the same thing. A platform is an abstraction layer in the design flow. For example, a core-based architecture and an instruction set architecture are platforms. We focus on the set of designs induced by this abstraction layer. For example, the set of all ASICs based on a particular core-based architecture and the set of all x86 programs are induced sets. Hence, a platform is equivalently a set of designs. Model-based design is about using platforms with useful modeling properties to specify designs, and then synthesizing implementations from these specifications. Hence model-based design is the view from above (more abstract, closer to the problem domain) and platform-based design is the view from below (less abstract, closer to the implementation technology). One way to define a platform is to provide a design language. Any valid expression in the language is an element of the set. A platform provides a set of cons...

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2007

FPGA Pipeline Synthesis Design Exploration Using Module Selection and Resource Sharing

Welson Sun; Michael J. Wirthlin; Stephen Neuendorffer

The primary goal during synthesis of digital signal processing (DSP) circuits is to minimize the hardware area while meeting a minimum throughput constraint. In field-programmable gate array (FPGA) implementations, significant area savings can be achieved by using slower, more area-efficient circuit modules and/or by time-multiplexing faster, larger circuit modules. Unfortunately, manual exploration of this design space is impractical. In this paper, we introduce a design exploration methodology that identifies the lowest cost FPGA pipelined implementation of an untimed synchronous data-flow graph by combined module selection with resource sharing under the context of pipeline scheduling. These techniques are applied together to minimize the area cost of the FPGA implementation while meeting a user-specified minimum throughput constraint. Two different algorithms are introduced for exploring the large design space. We show that even for small DSP algorithms, combining these techniques can offer significant area savings relative to applying any of them alone

international conference on formal methods and models for co design | 2004

Hierarchical reconfiguration of dataflow models

Stephen Neuendorffer; Edward A. Lee

This paper presents a unified approach to analyzing patterns of reconfiguration in dataflow graphs. The approach is based on hierarchical decomposition of the structure and execution of a dataflow model. In general, reconfiguration of any part of the system might occur at any point during the execution of a model. However, arbitrary reconfiguration must often be restricted, given the constraints of particular dataflow models of computation or modeling constructs. For instance, the reconfiguration of parameters that influence dataflow scheduling or soundness of data type checking must be more heavily restricted. The paper first presents an abstract mathematical model that is sufficient to represent the reconfiguration of many types of dataflow graphs. Using this model, a behavioral type theory is developed that bounds the points in the execution of a model when individual parameters can be reconfigured. This theory can be used to efficiently check semantic constraints on reconfiguration, enabling the safe use of parameter reconfiguration at all levels of hierarchy.

international conference on formal methods and models for co design | 2004

Classes and subclasses in actor-oriented design

Edward A. Lee; Stephen Neuendorffer

Actor-oriented languages provide a component composition methodology that emphasizes concurrency. The interfaces to actors are parameters and ports (vs. members and methods in object-oriented languages). Actors interact with one another through their ports via a messaging schema that can follow any of several concurrent semantics (vs. procedure calls, with prevail in OO languages). Domain-specific actor-oriented languages and frameworks are common (e.g. Simulink, LabVIEW, and many others). However, they lack many of the modularity and abstraction mechanisms that programmers have become accustomed to in 00 languages, such as classes, inheritance, interfaces, and polymorphism. This extended abstract shows the form that such mechanisms might take in AO languages. A prototype of these mechanisms realized in Ptolemy II is described.

ACM Transactions in Embedded Computing Systems | 2009

Classes and inheritance in actor-oriented design

Edward A. Lee; Xiaojun Liu; Stephen Neuendorffer

Actor-oriented components emphasize concurrency and temporal semantics and are used for modeling and designing embedded software and hardware. Actors interact with one another through ports via a messaging schema that can follow any of several concurrent semantics. Domain-specific actor-oriented languages and frameworks are common (Simulink, LabVIEW, SystemC, etc.). However, they lack many modularity and abstraction mechanisms that programmers have become accustomed to in object-oriented components, such as classes, inheritance, interfaces, and polymorphism, except as inherited from the host language. This article shows a form that such mechanisms can take in actor-oriented components, gives a formal structure, and describes a prototype implementation. The mechanisms support actor-oriented class definitions, subclassing, inheritance, and overriding. The formal structure imposes structural constraints on a model (mainly the “derivation invariant”) that lead to a policy to govern inheritance. In particular, the structural constraints permit a disciplined form of multiple inheritance with unambiguous inheritance and overriding behavior. The policy is based formally on a generalized ultrametric space with some remarkable properties. In this space, inheritance is favored when actors are “closer” (in the generalized ultrametric), and we show that when inheritance can occur from multiple sources, one source is always unambiguously closer than the other.

field-programmable logic and applications | 2009

Using C-to-gates to program streaming image processing kernels efficiently on FPGAs

Kristof Denolf; Stephen Neuendorffer; Kees A. Vissers

Effectively exploiting the variety of computational and storage resources available in common FPGA architectures for complex applications, such as the real-time implementation of vision algorithms, is often difficult in standard HDL design methodologies. Higher-level design tools can enable a design to more quickly explore a range of different architectures. In this paper we apply algorithmic C-to-FPGA synthesis technology in a structured design approach and demonstrate its added value on two relevant vision processing kernels: optical flow and debayering. The impact of the proposed approach on the design time, the FPGA resource consumption and the throughput is measured.

field programmable gate arrays | 2013

Building zynq® accelerators with Vivado® high level synthesis

Stephen Neuendorffer; Fernando Martinez-Vallina

Engineering complex systems inevitably requires a designer to balance many conflicting design requirements including performance, cost, power, and design time. In many cases, FPGAs enable engineers to balance these design requirements in ways not possible with other technologies like ASICs, ASSPs, GPUs or general purpose processors. This tutorial will focus on two of the newest commercial FPGA-related technologies, High Level Synthesis (HLS) and Programmable Logic integrated tightly with high performance embedded processors. In particular, we will present a detailed introduction to Vivado HLS, which is capable of synthesizing optimized FPGA circuits from algorithmic descriptions in C, C++ and SystemC. We will also present an introduction to the architecture of Zynq devices and show how interesting system architectures can be constructed using High Level Synthesis and the programmable logic portion of these devices.

field programmable gate arrays | 2006

Combining module selection and resource sharing for efficient FPGA pipeline synthesis

Welson Sun; Michael J. Wirthlin; Stephen Neuendorffer

In FPGA designs significant area savings can be achieved by using slower, more area-efficient circuit modules or by time-multiplexing faster circuit modules. Unfortunately, the ability of designers to manually make such trade-offs is limited by the large number of different architectural possibilities. In order to automatically perform these trade-offs, we have developed a synthesis methodology that generates pipelined data-path circuits from a high-level data-flow specification. This methodology is capable of selecting among a variety of circuit implementations for each operation, a synthesis technique often called module selection, and generating control logic to time multiplexing each circuit module, a synthesis technique often called resource sharing. These techniques are applied together to minimize the area cost of the resulting circuit while meeting a user-specified minimum throughput constraint. We show that even for small benchmark circuits, combining these techniques can offer significant area savings relative to applying them alone.

Explore More