Ning Weng

Southern Illinois University Carbondale

Publication

Featured research published by Ning Weng.

global communications conference | 2004

Characterizing network processing delay

Ramaswamy Ramaswamy; Ning Weng; Tilman Wolf

Computer networks have progressed from a simple store-and-forward medium to a complex communication infrastructure. Routers in the network need to implement a variety of functions ranging from simple packet classification for forwarding and firewalling to complex payload modifications for encryption and content adaptation. As these functions increase in number and complexity, more processing time is required, and packets experience a significant processing delay. In most network simulations, this delay has not been addressed because it was considered negligible. However, we show that this network processing delay can reach the magnitude of long-distance propagation delay and thus becomes a significant contributor to the overall packet delay. We evaluate different network applications and develop a model that characterizes packet processing cost with only a few parameters that can easily be derived from our simulations. To validate our simulation and our model, we compare them to actual network measurements. The contributions of this work can be used to increase the accuracy of network simulations and improve network performance estimations.
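The comparison between processing delay and propagation delay can be illustrated with a minimal sketch. It assumes a linear cost model with a fixed per-packet instruction count plus a per-byte instruction count; that form, the function names, and all numeric values below are hypothetical and are not taken from the paper.

```python
def processing_delay_s(packet_bytes, per_packet_instr, per_byte_instr, instr_per_second):
    """Estimated processing delay in seconds under an assumed linear cost model."""
    instructions = per_packet_instr + per_byte_instr * packet_bytes
    return instructions / instr_per_second


def propagation_delay_s(distance_m, signal_speed_m_per_s=2.0e8):
    """Propagation delay in seconds; the default speed approximates light in fiber."""
    return distance_m / signal_speed_m_per_s


# Hypothetical payload-processing application on a 100 MIPS processing engine.
proc = processing_delay_s(packet_bytes=1500, per_packet_instr=1000,
                          per_byte_instr=100, instr_per_second=1.0e8)
# Hypothetical 300 km link.
prop = propagation_delay_s(distance_m=300_000)

print(f"processing: {proc * 1e3:.2f} ms, propagation: {prop * 1e3:.2f} ms")
```

With these made-up parameters, the two delays come out at the same order of magnitude, which is the kind of situation the abstract describes.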

international symposium on performance analysis of systems and software | 2005

Analysis of Network Processing Workloads

Ramaswamy Ramaswamy; Ning Weng; Tilman Wolf

Network processing is becoming an increasingly important paradigm as the Internet moves towards an architecture with more complex functionality inside the network. Modern routers not only forward packets, but also process headers and payloads to implement a variety of functions related to security, performance, and customization. It is important to get a detailed understanding of the workloads associated with this processing in order to be able to develop efficient network processing engines. We present a tool called PacketBench, which provides a framework for implementing network processing applications and obtaining an extensive set of workload characteristics. PacketBench provides the support functions to handle various packet traces and manage packet memory. For statistics collection, PacketBench provides the ability to derive a number of microarchitectural and networking related metrics. The understanding of workload details of network processing has many practical applications. As network processing systems move towards highly parallel embedded systems, it is becoming increasingly important to explore the processing requirements of individual packets rather than averaged statistics. We show a range of workload results that focus on individual packets and the variation between them.

acm symposium on applied computing | 2005

Profiling and mapping of parallel workloads on network processors

Ning Weng; Tilman Wolf

Network processors are embedded system-on-a-chip multiprocessors that are optimized to perform simple packet processing tasks at data rates of several Gigabits per second. To meet the performance demands of increasing link speeds and more complex network applications, network processors are implemented with several dozen processor cores and execute multiple packet processing applications in parallel. The complexity of such systems makes it increasingly difficult for application developers to map applications to the various system resources and achieve optimal performance. We propose an automated profiling and mapping methodology for these highly parallel, embedded systems that starts out with a simple uniprocessor implementation of the networking application. An architecture-independent representation of the runtime behavior of the application is used to map and schedule different processing steps to the underlying hardware. An analytic performance model is used in the process to estimate system performance and to find a near-optimal solution through iteration.

IEEE Network | 2007

Runtime Support for Multicore Packet Processing Systems

Tilman Wolf; Ning Weng

Network processors promise a flexible, programmable packet processing infrastructure for network systems. To make full use of the capabilities of network processors, it is imperative to provide the ability to dynamically adapt to changing traffic patterns in the form of a network processor runtime system. The differences from existing operating systems and the main challenges lie in the multiprocessor nature of NPs, their on-chip resource constraints, and real-time processing requirements. In this article we explore the key design trade-offs that need to be considered when designing a network processor operating system. In particular, we explore the performance impact of application analysis on partitioning, traffic characterization, workload mapping, and runtime adaptation. We present and discuss qualitative and quantitative results in the context of a particular application analysis and mapping framework. The observations and conclusions are generally applicable to any runtime environment for network processors.

Network Processor Design: Issues and Practices, Volume 3 | 2005

Application analysis and resource mapping for heterogeneous network processor architectures

Ramaswamy Ramaswamy; Ning Weng; Tilman Wolf

In this chapter, an annotated, directed, acyclic graph is introduced to represent application characteristics and dependencies in an architecture-independent fashion. A methodology is developed to automatically derive this annotated directed acyclic graph (ADAG) from run-time instruction traces that can be obtained easily from simulations. To consider the natural clustering of instructions within an application, maximum local ratio cut (MLRC) is used to group instruction blocks and reduce the overall ADAG size. For four network processing applications, such ADAGs are presented, and it is shown how the inherent parallelism (multiprocessing or pipelining) can be observed. Using the ADAG representation, processing steps can be allocated to processing resources using a heuristic that uses node criticality as a metric. This is an important step towards automatically analyzing applications and mapping processing tasks to heterogeneous network processor architectures. Finally, it is necessary to develop a robust methodology for automatically identifying processing blocks for coprocessors and hardware accelerators.
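Criticality-driven allocation on a task graph can be sketched as follows. This is an illustrative list-scheduling sketch, not the chapter's algorithm: it assumes that a node's criticality is the length of the longest cost-weighted path from that node to a sink, and that all processing engines are identical. The graph, the costs, and the function names are hypothetical.

```python
def criticality(graph, cost):
    """Longest cost-weighted path from each node to a sink (assumed definition)."""
    memo = {}

    def visit(node):
        if node not in memo:
            tail = max((visit(s) for s in graph.get(node, [])), default=0)
            memo[node] = cost[node] + tail
        return memo[node]

    return {node: visit(node) for node in cost}


def allocate(graph, cost, num_engines):
    """Greedy list scheduling: place the most critical ready node first."""
    crit = criticality(graph, cost)
    preds = {node: [] for node in cost}
    for node, succs in graph.items():
        for s in succs:
            preds[s].append(node)

    finish = {}                       # node -> finish time
    engine_free = [0] * num_engines   # time at which each engine becomes idle
    placement = {}                    # node -> engine index
    remaining = set(cost)
    while remaining:
        ready = [n for n in remaining if all(p in finish for p in preds[n])]
        # Highest criticality first; ties broken by name for determinism.
        node = min(ready, key=lambda n: (-crit[n], n))
        earliest = max((finish[p] for p in preds[node]), default=0)
        engine = min(range(num_engines),
                     key=lambda e: max(engine_free[e], earliest))
        start = max(engine_free[engine], earliest)
        finish[node] = start + cost[node]
        engine_free[engine] = finish[node]
        placement[node] = engine
        remaining.remove(node)
    return placement, max(finish.values())


# Hypothetical four-block application: a feeds b and c, which both feed d.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
cost = {"a": 2, "b": 3, "c": 1, "d": 2}
placement, makespan = allocate(graph, cost, num_engines=2)
print(placement, makespan)
```

In this toy graph, blocks b and c can run in parallel on two engines, so the makespan equals the critical path length a, b, d.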

architectures for networking and communications systems | 2005

Design considerations for network processor operating systems

Tilman Wolf; Ning Weng; Chia-Hui Tai

Network processors (NPs) promise a flexible, programmable packet processing infrastructure for network systems. To make full use of the capabilities of network processors, it is imperative to provide the ability to dynamically adapt to changing traffic patterns and to provide run-time support in the form of a network processor operating system. The differences from existing operating systems and the main challenges lie in the multiprocessor nature of NPs, their on-chip resource constraints, and the real-time processing requirements. In this paper, we explore the key design tradeoffs that need to be considered when designing a network processor operating system. In particular, we explore the performance impact of (1) application analysis for partitioning, (2) network traffic characterization, (3) workload mapping, and (4) run-time adaptation. We present and discuss qualitative and quantitative results in the context of a particular application analysis and mapping framework, but the observations and conclusions are generally applicable to any run-time environment for network processors.

Journal of Systems Architecture | 2009

Analysis of network processing workloads

Ramaswamy Ramaswamy; Ning Weng; Tilman Wolf

ACM Transactions on Embedded Computing Systems | 2009

Analytic modeling of network processors for parallel workload mapping

Ning Weng; Tilman Wolf

Network processors are heterogeneous system-on-chip multiprocessors that are optimized to perform packet forwarding and processing tasks at Gigabit data rates. To meet the performance demands of increasing link speeds and complex network applications, network processors are implemented with several dozen embedded processor cores and hardware accelerators that run multiple packet processing applications in parallel. The parallel nature of the processing system makes it increasingly difficult for application developers to understand and manage resources and map processing tasks to the hardware. To address this problem, we present a methodology for profiling and analyzing network processor applications, mapping processing tasks to a generalized network processor architecture, and analytically determining the expected throughput performance. The key novelty of this work is not only the adaptation of application analysis and mapping algorithms to heterogeneous network processors, but also that the entire process can be automated and hidden from the application developer. Starting with the analysis of a uniprocessor implementation of the application, the process yields a mapping of the partitioned application that shows best performance for a given network processor system. The simplicity of the proposed randomized mapping algorithm allows the use of this methodology in network processor runtime systems where dynamic reallocation of tasks is necessary but processing power is limited. We present results that show the effectiveness of the analysis and mapping methodology as well as its application to design space exploration.

acm special interest group on data communication | 2003

Considering processing cost in network simulations

Ramaswamy Ramaswamy; Ning Weng; Tilman Wolf

In many network simulations and models the cost of processing a packet is considered negligible or overly simplified. The functionality of routers is steadily increasing and complex processing of packet payloads is being implemented (deep packet classification, encryption, content transcoding). We show two examples where processing cost can contribute to a significant portion of the overall packet delay. To enable a more precise consideration of processing delay, we present a tool called NPEST (Network Processing Estimator). NPEST is a framework on top of which packet processing functionality can be implemented and simulated using an actual processor simulator. NPEST can be programmed in C and greatly simplifies the implementation and simulation process as compared to using network processor simulators. The results derived from NPEST can either be used directly or be aggregated to processing statistics for network simulations. We present such results for two prototype applications: IP forwarding and IP security. We also show a comparison between the results obtained from NPEST and an Intel IXP1200 network processor.

Journal of Network and Computer Applications | 2011

Information quality model and optimization for 802.15.4-based wireless sensor networks

Ning Weng; I-Hung Li; Lucas Vespa

High information quality is a paramount requirement for wireless sensor network (WSN) monitoring applications. However, it is challenging to achieve a cost-effective information quality solution due to unpredictable environment noise and events, unreliable wireless channels and network bandwidth, and sensor resource and energy constraints. Specifically, the dynamic and unreliable nature of WSNs makes it difficult to pre-determine optimum sensor rates and predict packet loss. To address this problem, we present an information quality metric which characterizes information quality based on the sampling frequency of sensor nodes and the packet loss rate during network transmission. Our fundamental quality metric is based on signal-to-noise ratio and is therefore application independent. Based on our metric, a quality-aware scheduling system (QSS) is developed, which exploits cross-layer control of sensor nodes to effectively schedule data sensing and forwarding. In particular, we develop and evaluate several QSS scheduling mechanisms: passive, reactive, and perceptive. These mechanisms can adapt to environment noise, bandwidth variation, and wireless channel collisions by dynamically controlling sensor rates and phase. Our experimental results indicate that our QSS is a novel and effective approach to improve information quality for WSNs.
