Publication


Featured research published by Fei He.


architectures for networking and communications systems | 2007

Towards high-performance flow-level packet processing on multi-core network processors

Yaxuan Qi; Bo Xu; Fei He; Baohua Yang; Jianming Yu; Jun Li

There is growing interest in designing high-performance network devices that process packets at the flow level. Applications such as stateful access control, deep inspection and flow-based load balancing all require efficient flow-level packet processing. In this paper, we present the design of a high-performance flow-level packet processing system based on multi-core network processors. The main contributions of this paper are: a) a high-performance flow classification algorithm optimized for network processors; b) an efficient flow state management scheme that leverages the memory hierarchy to support a large number of concurrent flows; c) two hardware-optimized order-preserving strategies that preserve internal and external per-flow packet order. Experimental results show that: a) the proposed flow classification algorithm, AggreCuts, outperforms the well-known HiCuts algorithm in terms of classification rate and memory usage; b) the presented SigHash scheme can manage over 10M concurrent flow states on the Intel IXP2850 NP with an extremely low collision rate; c) the internal packet order-preserving scheme using an SRAM queue-array achieves about 70% of the performance of the external packet order-preserving scheme realized by ordered-thread execution.
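The abstract does not describe how SigHash works internally, so the following is only a minimal sketch of the general idea of signature-based hashing for flow state lookup, not the paper's scheme. It assumes a flow is keyed by its 5-tuple, that part of the hash selects a bucket, and that a short signature taken from other hash bits lets most non-matching entries be rejected without a full key comparison. The class name `SignatureFlowTable`, the bucket count, and the use of CRC32 are all hypothetical choices made for illustration.

```python
import zlib


def _hash32(key):
    """32-bit hash of a flow key (CRC32 is an illustrative choice)."""
    return zlib.crc32(repr(key).encode())


class SignatureFlowTable:
    """Hypothetical flow-state table: bucket index plus short signature."""

    def __init__(self, num_buckets=1024):
        self.num_buckets = num_buckets
        self.buckets = [[] for _ in range(num_buckets)]
        self.full_compares = 0  # counts expensive full-key comparisons

    def _locate(self, key):
        h = _hash32(key)
        index = h % self.num_buckets      # low bits pick the bucket
        signature = (h >> 16) & 0xFFFF    # high bits form the signature
        return index, signature

    def insert(self, key, state):
        index, sig = self._locate(key)
        bucket = self.buckets[index]
        for i, (s, k, _) in enumerate(bucket):
            if s == sig and k == key:
                bucket[i] = (sig, key, state)  # update existing flow
                return
        bucket.append((sig, key, state))

    def lookup(self, key):
        index, sig = self._locate(key)
        for s, k, state in self.buckets[index]:
            if s != sig:
                continue  # cheap reject: signatures differ
            self.full_compares += 1
            if k == key:
                return state
        return None


table = SignatureFlowTable()
flow = ("10.0.0.1", "10.0.0.2", 40000, 80, "tcp")
table.insert(flow, "ESTABLISHED")
print(table.lookup(flow))
```

On hardware with a memory hierarchy, the compact signatures could plausibly live in faster memory than the full keys and states, but that placement is an assumption here rather than something the abstract specifies.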


international conference on parallel processing | 2007

Towards Optimized Packet Classification Algorithms for Multi-Core Network Processors

Yaxuan Qi; Bo Xu; Fei He; Xin Zhou; Jianming Yu; Jun Li

In this paper, a novel packet classification scheme optimized for multi-core network processors is proposed. The algorithm, Explicit Cuttings (ExpCuts), adopts a hierarchical space aggregation technique to significantly reduce memory usage. Consequently, without a burst in memory usage, the time-consuming linear search of conventional decision-tree-based packet classification algorithms is eliminated, and an explicit worst-case search time is achieved. To evaluate the performance of ExpCuts, we implement the algorithm, as well as HiCuts and HSM, on the Intel IXP2850 network processor. Experimental results show that ExpCuts outperforms the best-known existing algorithms in terms of memory usage and classification speed.


Computer Communications | 2011

OpenGate: Towards an open network services gateway

Yaxuan Qi; Fei He; Xiang Wang; Xinming Chen; Yibo Xue; Jun Li

In this paper, we propose an extensible open network services gateway (OpenGate) for high-performance network processing at the edge of high-speed networks. The OpenGate system embraces recent advances in open network technologies: performance is guaranteed by using open-standard ATCA platforms, and extensibility is achieved by employing parallelized open source software. As an application example of OpenGate, a high-performance security gateway, OpenGate-SG, was developed using existing ATCA platforms and open source software. This system provides multiple security services, including stateful firewall, intrusion prevention and anti-virus. Experimental results show that OpenGate-SG can achieve up to 200Gbps stateful firewall throughput with 8Gbps intrusion prevention and anti-virus, which is competitive with the performance of today's high-end security products. OpenGate-SG has also been tested as a security gateway for a university campus network with more than 1000 students.


architectures for networking and communications systems | 2008

Towards effective network algorithms on multi-core network processors

Yaxuan Qi; Zongwei Zhou; Baohua Yang; Fei He; Yibo Xue; Jun Li

To build high-performance network devices with holistic security protection, a large number of algorithms have been proposed. However, multi-core implementation of the existing algorithms suffers from three limitations: performance instability, data-structure heterogeneity, and hardware dependency. In this paper, we propose three principles for effective network processing on multi-core network processors. To verify the effectiveness of these principles, algorithms for two typical network processing tasks are redesigned and implemented on the Cavium Octeon3860 network processor. Test results show that our schemes achieve superior performance in comparison with existing best-known algorithms.


Tsinghua Science & Technology | 2011

Accelerating Application Identification with Two-Stage Matching and Pre-Classification

Fei He; Fan Xiang; Yiyang Shao; Yibo Xue; Jun Li

Modern datacenter and enterprise networks require application identification to enable granular traffic control that either improves data transfer rates or ensures network security. Providing application visibility as a core network function is challenging due to its performance requirements, including high throughput, low memory usage, and high identification accuracy. This paper presents a payload-based application identification method built on a signature matching engine that exploits the characteristics of application identification. The solution uses two-stage matching and pre-classification to simultaneously improve throughput and reduce memory usage. Compared to a state-of-the-art general-purpose regular expression engine, this matching engine reduces memory use by 38% and triples the throughput. In addition, the solution is orthogonal to most existing optimization techniques for regular expression matching, which means it can be leveraged to further increase the performance of other matching algorithms.
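The abstract names two-stage matching and pre-classification without detailing either, so the sketch below only illustrates one common way such a pipeline can be arranged and should not be read as the paper's engine. It assumes pre-classification narrows the signature set by destination port, that stage one is a cheap literal substring test, and that stage two runs the full regular expression only on signatures that passed stage one. The signature list, port groups, and the function name `identify` are hypothetical.

```python
import re

# Hypothetical signatures: (application, group, literal anchor, full pattern).
SIGNATURES = [
    ("http", "web", b"HTTP/1.",
     re.compile(rb"^(GET|POST|HEAD) \S+ HTTP/1\.[01]\r\n")),
    ("ssh", "remote", b"SSH-", re.compile(rb"^SSH-\d\.\d+-\S+")),
    ("smtp", "mail", b"ESMTP", re.compile(rb"^220 \S+ ESMTP")),
]

# Hypothetical pre-classification table: destination port -> signature group.
PORT_GROUPS = {80: "web", 8080: "web", 22: "remote", 25: "mail"}


def identify(dst_port, payload):
    """Return the application name for a payload, or None if unknown."""
    # Pre-classification: restrict candidates by port group when one is known.
    group = PORT_GROUPS.get(dst_port)
    candidates = [s for s in SIGNATURES if group is None or s[1] == group]

    for app, _, literal, pattern in candidates:
        # Stage 1: cheap literal test filters out most signatures.
        if literal not in payload:
            continue
        # Stage 2: full regular expression confirms the match.
        if pattern.search(payload):
            return app
    return None


print(identify(80, b"GET /index.html HTTP/1.1\r\nHost: example\r\n\r\n"))
print(identify(9999, b"SSH-2.0-OpenSSH_9.6\r\n"))
```

The design point being illustrated is that the expensive matcher sees only a small fraction of (payload, signature) pairs; how the paper actually partitions work between the two stages is not stated in the abstract.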


global communications conference | 2010

YACA: Yet Another Cluster-Based Architecture for Network Intrusion Prevention

Fei He; Yaxuan Qi; Yibo Xue; Jun Li

Inline stateful and deep inspection in network intrusion prevention systems (NIPS) is increasingly challenged by the fast-growing volume and ever-increasing complexity of network traffic. Traditional cluster-based architectures provide a solution for scalable, high-performance NIPS, but share some common limitations. This paper proposes yet another cluster-based architecture (YACA) with a stateful traffic splitter. As an architectural approach to building a high-performance NIPS, we present a novel design for the stateful traffic splitter. The performance of its prototype, implemented on a network processor, demonstrates that such a design is suitable for the proposed architecture.
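As a rough illustration of what a stateful traffic splitter has to guarantee, the sketch below dispatches packets to cluster nodes so that both directions of a connection reach the same node, and keeps a session table so that an existing flow stays pinned even if the node list changes. This is a generic sketch under those assumptions, not YACA's design; the class `TrafficSplitter`, the node names, and the CRC32-based choice of node are hypothetical.

```python
import zlib


def canonical_flow(src_ip, src_port, dst_ip, dst_port, proto):
    """Order the two endpoints so both directions yield the same key."""
    a, b = (src_ip, src_port), (dst_ip, dst_port)
    lo, hi = (a, b) if a <= b else (b, a)
    return (lo, hi, proto)


class TrafficSplitter:
    """Hypothetical splitter that pins each flow to one inspection node."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.sessions = {}  # canonical flow key -> assigned node

    def dispatch(self, src_ip, src_port, dst_ip, dst_port, proto):
        flow = canonical_flow(src_ip, src_port, dst_ip, dst_port, proto)
        node = self.sessions.get(flow)
        if node is None:
            # New flow: choose a node by hashing, then remember the choice.
            digest = zlib.crc32(repr(flow).encode())
            node = self.nodes[digest % len(self.nodes)]
            self.sessions[flow] = node
        return node


splitter = TrafficSplitter(["ips-0", "ips-1", "ips-2"])
forward = splitter.dispatch("10.0.0.1", 40000, "10.0.0.2", 80, "tcp")
reverse = splitter.dispatch("10.0.0.2", 80, "10.0.0.1", 40000, "tcp")
print(forward == reverse)
```

A stateless hash alone would already keep the two directions together; the session table is what makes the splitter stateful, at the cost of per-flow memory on the frontend.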


international conference on computer communications | 2009

Discrete Bit Selection: Towards a Bit-Level Heuristic Framework for Multi-Dimensional Packet Classification

Baohua Yang; Yaxuan Qi; Fei He; Yibo Xue; Jun Li

Packet classification remains a challenging problem in practice, given the large number of classification rules and the constant growth of performance requirements. Most existing algorithms try to solve the problem heuristically by leveraging the inherent field-level characteristics of the rules. This paper proposes a bit-level heuristic framework, Discrete Bit Selection (DBS), for multi-dimensional packet classification. Preliminary experimental results show that DBS-based algorithms achieve much better search time and memory requirements than well-known field-level algorithms on various real-life rule sets.
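To make the contrast between field-level and bit-level heuristics concrete, here is a minimal sketch of choosing a single header bit to split a rule set on. It assumes rules are written as ternary strings over `0`, `1` and `*` (wildcard), that a wildcard rule must be copied into both halves of a split, and that a good bit is one that keeps the larger half small and duplicates few rules. That greedy criterion and the function names are assumptions made for illustration; the abstract does not state the selection criterion DBS uses.

```python
def split_on_bit(rules, bit):
    """Partition ternary rules on one bit; wildcards go to both sides."""
    zeros = [r for r in rules if r[bit] in "0*"]
    ones = [r for r in rules if r[bit] in "1*"]
    return zeros, ones


def select_bit(rules):
    """Pick the bit whose split has the smallest worst-case partition.

    Ties are broken by the total number of rules after duplication,
    then by the lowest bit position.
    """
    width = len(rules[0])

    def cost(bit):
        zeros, ones = split_on_bit(rules, bit)
        return (max(len(zeros), len(ones)), len(zeros) + len(ones))

    return min(range(width), key=cost)


# Hypothetical 4-bit rule set.
rules = ["00**", "01**", "1*0*", "1*1*"]
best = select_bit(rules)
print(best, split_on_bit(rules, best))
```

Applying the selection recursively to each partition would yield a decision tree whose branching is driven by individual bits rather than by whole header fields.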


architectures for networking and communications systems | 2009

SANS: a scalable architecture for network intrusion prevention with stateful frontend

Fei He; Yaxuan Qi; Yibo Xue; Jun Li

Inline stateful and deep inspection for intrusion prevention is becoming more challenging due to the increase in both the volume of network traffic and the complexity of the analysis requirements. In this work, we pursue a novel architectural approach, named SANS, which takes advantage of both new-generation network processors for packet-header-based processing and commodity x86 platforms for packet payload processing. A session table scheme is designed for the stateful frontend in SANS to achieve wire-speed inline processing.


international conference on distributed computing systems | 2008

Fast Path Session Creation on Network Processors

Bo Xu; Yaxuan Qi; Fei He; Zongwei Zhou; Yibo Xue; Jun Li

Today's security gateways are required to block unauthorized access not only by authenticating packet headers, but also by inspecting connection states to defend against malicious intrusions. Hence, the session creation rate plays a key role in determining the overall performance of stateful intrusion prevention systems. In this paper, we propose a high-speed session creation scheme optimized for network processors. The main contributions are: a) a high-performance flow classification algorithm on network processors; b) an efficient TCP three-way handshake scheme designed for fast-path processing using two-stage intelligent hashing. Experimental results show that: a) the presented parallel-optimized flow classification algorithm, Parallel Search Cross-Producting, outperforms the original Cross-Producting and Binary Search Cross-Producting algorithms, with 300% and 60% increases in classification speed, respectively; b) the proposed fast-path three-way handshake scheme, IntelliHash, achieves a TCP connection creation rate of over 2M connections per second.
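For readers unfamiliar with what session creation involves, the sketch below tracks the TCP three-way handshake (SYN, SYN-ACK, ACK) per connection and only marks a session as established once all three steps have been seen in order. It is a simplified illustration and not the IntelliHash scheme: it uses an ordinary dictionary rather than two-stage hashing, ignores sequence numbers, timeouts and teardown, and the names `HandshakeTracker` and `State` are hypothetical.

```python
from enum import Enum


class State(Enum):
    SYN_SEEN = 1
    SYN_ACK_SEEN = 2
    ESTABLISHED = 3


class HandshakeTracker:
    """Hypothetical session table keyed by (client, server) endpoints."""

    def __init__(self):
        self.sessions = {}

    def on_packet(self, src, dst, flags):
        """Advance the handshake; return the state, or None to drop."""
        flags = frozenset(flags)
        if flags == {"SYN"}:
            # Client opens a connection: create the session entry.
            return self.sessions.setdefault((src, dst), State.SYN_SEEN)
        if flags == {"SYN", "ACK"}:
            key = (dst, src)  # the reply travels server -> client
            if self.sessions.get(key) is State.SYN_SEEN:
                self.sessions[key] = State.SYN_ACK_SEEN
                return State.SYN_ACK_SEEN
            return None
        if flags == {"ACK"}:
            key = (src, dst)
            if self.sessions.get(key) is State.SYN_ACK_SEEN:
                self.sessions[key] = State.ESTABLISHED
                return State.ESTABLISHED
            # Later ACKs in either direction belong to an open session.
            for k in (key, (dst, src)):
                if self.sessions.get(k) is State.ESTABLISHED:
                    return State.ESTABLISHED
            return None
        return None


client, server = ("10.0.0.1", 40000), ("10.0.0.2", 80)
tracker = HandshakeTracker()
tracker.on_packet(client, server, {"SYN"})
tracker.on_packet(server, client, {"SYN", "ACK"})
print(tracker.on_packet(client, server, {"ACK"}))
```

Every new connection costs three table operations before any payload is inspected, which is why the session creation rate can bound the throughput of a stateful gateway.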


architectures for networking and communications systems | 2009

OASis: towards extensible open-architecture services platforms

Yaxuan Qi; Fei He; Xiang Wang; Xinming Chen; Yibo Xue; Jun Li

In this paper, we propose an extensible Open-Architecture Services platform (OASis) for high-performance network processing. OASis embraces recent advances in open technologies, including open source software, open system standards and open network architectures. Three programming models are proposed for target-specific processing modules: a multi-granularity packet processing model for network processing; a thread-isolated parallel programming model for service processing; and a message-based management model for centralized system administration. As an application example of OASis, a Unified Threat Management (UTM) prototype is implemented. This prototype provides multiple network security services, including stateful firewall, intrusion detection, and virus scanning. Experimental results show that the OASis-UTM prototype can achieve 40Gbps stateful firewall performance together with 4-8Gbps intrusion detection and anti-virus performance on a 12U 14-slot ATCA platform.

Collaboration


Top Co-Authors

Jun Li

Tsinghua University

Bo Xu

Tsinghua University

Xiang Wang

University of Science and Technology of China
