Publication


Featured research published by Qiong Dai.


International Symposium on Computers and Communications | 2014

A fast regular expression matching engine for NIDS applying prediction scheme

Lei Jiang; Qiong Dai; Qiu Tang; Jianlong Tan; Binxing Fang

Regular expression matching is important because it lies at the heart of many networking applications that use deep packet inspection (DPI). For example, modern network intrusion detection systems (NIDSs) typically perform regular expression matching with deterministic finite automata (DFA). However, DFA suffers from high memory consumption caused by the state blowup problem. Many algorithms have been proposed to compress DFA storage, but they usually pay for it with lower matching speed and higher memory bandwidth. In this paper, we first propose an effective DFA compression algorithm that exploits the similarity between DFA states. Then, we apply a next-state prediction strategy and present a fast DFA matching engine. By carefully designing the DFA matching circuit, we keep the prediction success rate above 99.5% and thus achieve a matching speed comparable to that of the original DFA algorithm. In terms of memory consumption, experimental results show that with typical NIDS rule sets, our algorithm compresses the original DFA by more than 99%. Mapped onto a Xilinx Virtex-7 FPGA chip, the algorithm achieves a throughput of more than 200 Gbps.
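The idea of exploiting similarity between DFA states can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a simple delta encoding in which each state stores only the transitions that differ from its most similar earlier state, and it leaves out the next-state prediction circuit entirely. The names `compress` and `next_state` are hypothetical.

```python
from typing import Dict, List, Tuple

Row = List[int]                       # next state for each input symbol
Packed = Tuple[int, Dict[int, int]]   # (base state, differing transitions)


def compress(table: List[Row]) -> List[Packed]:
    """Delta-encode each row against its most similar earlier row.

    Row 0 is stored in full with base -1 (assumption: any row could serve
    as the root; row 0 is chosen only for simplicity).
    """
    packed: List[Packed] = []
    for i, row in enumerate(table):
        if i == 0:
            packed.append((-1, dict(enumerate(row))))
            continue
        # Pick the earlier row that agrees with this one on the most symbols.
        base = max(range(i),
                   key=lambda j: sum(a == b for a, b in zip(row, table[j])))
        diffs = {s: t for s, t in enumerate(row) if table[base][s] != t}
        packed.append((base, diffs))
    return packed


def next_state(packed: List[Packed], state: int, symbol: int) -> int:
    """Follow base references until a row that stores this symbol is found."""
    while True:
        base, diffs = packed[state]
        if symbol in diffs:
            return diffs[symbol]
        state = base  # always a smaller index, so the walk reaches row 0


table = [
    [1, 0, 0, 0],
    [1, 2, 0, 0],
    [1, 0, 3, 0],
    [1, 0, 0, 0],
]
packed = compress(table)
stored = sum(len(diffs) for _, diffs in packed)  # transitions actually kept
```

The cost of this kind of encoding is the chain of base references followed on lookup, which is the speed penalty the abstract says prediction is meant to hide.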


Concurrency and Computation: Practice and Experience | 2017

RICS‐DFA: a space and time‐efficient signature matching algorithm with Reduced Input Character Set

Qiu Tang; Lei Jiang; Qiong Dai; Majing Su; Hongtao Xie; Binxing Fang

Regular expression matching, a core component of deep packet inspection, is widely used in modern network intrusion detection systems, traffic classification systems, network monitoring systems, and so on. In these systems, regular expressions are typically converted to a deterministic finite automaton (DFA), which takes O(1) time to scan each input character. However, a DFA generally consumes a large amount of memory. This paper proposes a novel, space‐efficient and time‐efficient DFA representation, called reduced input character set DFA (RICS‐DFA). A character escaping and replacing scheme is first introduced to decrease the size of the DFA's character set, and a series of optimization techniques then reduces the DFA's space requirement. Based on transition rewriting, a RICS‐DFA construction algorithm with time complexity of O(n) is presented. For real rule sets, RICS‐DFA reduces memory consumption by 68–92% compared with the original DFA. Finally, this paper designs a scalable RICS‐DFA matching engine on a field‐programmable gate array platform in which the reduced state transition matrix is mapped to on‐chip memories. The throughput of deep packet inspection for real rule sets reaches 7–50.5 Gbps.
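Why a smaller input character set shrinks a DFA can be seen from a related, classic technique: merging input symbols whose transition columns are identical. The sketch below shows that technique only. It is not the character escaping and replacing scheme of RICS‐DFA, whose details the abstract does not give, and `reduce_alphabet` is a hypothetical name.

```python
from typing import Dict, List, Tuple


def reduce_alphabet(table: List[List[int]]) -> Tuple[List[int], List[List[int]]]:
    """Merge symbols with identical columns into one character class.

    Returns (char_class, reduced), where char_class[c] is the class of
    symbol c and reduced[s][k] is the next state from s on class k.
    """
    n_symbols = len(table[0])
    class_of_column: Dict[Tuple[int, ...], int] = {}
    char_class: List[int] = []
    for c in range(n_symbols):
        column = tuple(row[c] for row in table)
        # A column seen for the first time gets the next free class id.
        char_class.append(class_of_column.setdefault(column, len(class_of_column)))

    n_classes = len(class_of_column)
    reduced = [[0] * n_classes for _ in table]
    for c, k in enumerate(char_class):
        for s, row in enumerate(table):
            reduced[s][k] = row[c]
    return char_class, reduced


table = [
    [1, 0, 0, 1],
    [1, 2, 2, 1],
    [0, 0, 0, 0],
]
char_class, reduced = reduce_alphabet(table)
# Matching now looks up reduced[state][char_class[symbol]].
```

The table width drops from the number of symbols to the number of classes, at the price of one extra lookup per input character.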


Frontiers of Computer Science in China | 2015

Fast approximate matching of binary codes with distinctive bits

Chenggang Clarence Yan; Hongtao Xie; Bing Zhang; Yanping Ma; Qiong Dai; Yizhi Liu

Although the distance between binary codes can be computed quickly in Hamming space, linear search is not practical for large-scale datasets. Attention has therefore been paid to the efficiency of approximate nearest neighbor search, in which hierarchical clustering trees (HCT) are widely used. However, HCT select cluster centers randomly and build indexes with the entire binary code, which degrades search performance. In this paper, we first propose a new clustering algorithm, which chooses cluster centers on the basis of relative distances and uses a more homogeneous partition of the dataset than HCT does to build the hierarchical clustering trees. Then, we present an algorithm to compress binary codes by extracting distinctive bits according to the standard deviation of each bit. Consequently, a new index is proposed using compressed binary codes based on hierarchical decomposition of binary spaces. Experiments conducted on reference datasets and a dataset of one billion binary codes demonstrate the effectiveness and efficiency of our method.
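Selecting distinctive bits by standard deviation can be sketched as follows. The abstract does not say how the bits are ranked, so this example assumes that a higher per-bit standard deviation means a more distinctive bit and that the top `k` bits are kept. All function names are hypothetical.

```python
import math
from typing import List


def distinctive_bits(codes: List[int], n_bits: int, k: int) -> List[int]:
    """Return the positions of the k bits with the highest standard deviation."""
    n = len(codes)
    stds = []
    for b in range(n_bits):
        p = sum((code >> b) & 1 for code in codes) / n  # fraction of ones
        stds.append(math.sqrt(p * (1 - p)))             # std of a 0/1 variable
    ranked = sorted(range(n_bits), key=lambda b: (-stds[b], b))
    return sorted(ranked[:k])


def compress_code(code: int, bits: List[int]) -> int:
    """Pack the selected bit positions of `code` into a shorter code."""
    out = 0
    for i, b in enumerate(bits):
        out |= ((code >> b) & 1) << i
    return out


def hamming(a: int, b: int) -> int:
    """Hamming distance between two binary codes stored as integers."""
    return bin(a ^ b).count("1")


codes = [0b0001, 0b0011, 0b0101, 0b0111]
bits = distinctive_bits(codes, n_bits=4, k=2)
short = [compress_code(c, bits) for c in codes]
```

In this toy dataset bit 0 is always set and bit 3 is never set, so neither helps to tell codes apart and both are dropped.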


Advances in Multimedia | 2014

Fast Search of Binary Codes with Distinctive Bits

Yanping Ma; Hongtao Xie; Zhineng Chen; Qiong Dai; Yinfei Huang; Guangrong Ji

Although the distance between binary codes can be computed quickly in Hamming space, linear search is not practical for large-scale datasets. Attention has therefore been paid to the efficiency of approximate nearest neighbor search, in which Hierarchical Clustering Trees (HCT) is the state-of-the-art method. However, HCT builds its index with the whole binary codes, which degrades search performance. In this paper, we first propose an algorithm to compress binary codes by extracting distinctive bits according to the standard deviation of each bit. Then, a new index is proposed using compressed binary codes based on hierarchical decomposition of binary spaces. Experiments conducted on reference datasets and a dataset of one billion binary codes demonstrate the effectiveness and efficiency of our method.


Procedia Computer Science | 2014

A Real-time Updatable FPGA-based Architecture for Fast Regular Expression Matching

Qiu Tang; Lei Jiang; Xin-xing Liu; Qiong Dai

In recent years, regular expressions have been widely used in many network fields, but more and more applications require a finite state machine (FSM) that can be updated in real time and a space-reduced DFA because of limited memory capacity. In this paper, we first propose a new FPGA architecture that supports real-time FSM updates, and design a special protocol for these updates. Second, to support large-scale regular expression rule sets with complex semantics, we design an improved run-length encoding (iRLE) algorithm based on FPGA to reduce the DFA's storage space. The proposed algorithm achieves a good compression ratio and requires only 2 clock cycles per state transition. The experimental results also show that the new algorithm has advantages in both compression ratio and speed, and that the maximum throughput of the automaton reaches 10.7 Gbps.
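For readers unfamiliar with run-length encoding of a DFA, a minimal sketch of plain RLE applied to one transition row follows. It is not the improved iRLE algorithm, whose details the abstract does not give, and the linear scan in `rle_lookup` says nothing about the 2-cycle hardware lookup. Both function names are hypothetical.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # (next state, number of consecutive symbols)


def rle_encode(row: List[int]) -> List[Run]:
    """Collapse consecutive identical next states into (state, length) runs."""
    runs: List[List[int]] = []
    for target in row:
        if runs and runs[-1][0] == target:
            runs[-1][1] += 1
        else:
            runs.append([target, 1])
    return [(target, length) for target, length in runs]


def rle_lookup(runs: List[Run], symbol: int) -> int:
    """Find the next state for `symbol` by walking the runs in order."""
    for target, length in runs:
        if symbol < length:
            return target
        symbol -= length
    raise IndexError("symbol outside the encoded row")


row = [0, 0, 0, 5, 5, 0, 0, 0]
runs = rle_encode(row)  # three runs instead of eight entries
```

RLE works well here because most transitions of a typical DFA state lead to the same few targets, so rows contain long runs.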


International Symposium on Computers and Communications | 2017

Acceleration of RSA processes based on hybrid ARM-FPGA cluster

Xu Bai; Lei Jiang; Qiong Dai; Jiajia Yang; Jianlong Tan

Hardware/software cooperation on hybrid architectures such as the Xilinx Zynq SoC, which combines an ARM CPU with FPGA fabric, offers a high-performance, low-power platform for accelerating the RSA algorithm. This paper adopts the none-subtraction Montgomery algorithm and the Chinese Remainder Theorem (CRT) to implement high-speed RSA processors, and deploys a 48-node cluster infrastructure based on the Zynq SoC to achieve extremely high scalability and throughput of RSA computing. In this design, the ARM implements node-to-node communication with the Message Passing Interface (MPI), while the FPGA handles the complex calculation. The experimental results show that overall performance scales linearly with the number of nodes, and that the cluster achieves a 6× to 9× speedup over a multi-core desktop (Intel i7-3770) and performance comparable to a many-core server (288-core). In addition, we gain up to 2.5× energy efficiency compared to these two traditional platforms.
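The role of the CRT in RSA can be shown with a small software sketch: the private-key operation is split into two half-size exponentiations modulo `p` and `q`, which are then recombined. The toy primes and the function name `decrypt_crt` are assumptions made for illustration; the sketch does not model the Montgomery multiplier or the FPGA design, and it needs Python 3.8 or later for `pow(x, -1, m)`.

```python
# Toy parameters for illustration only; real keys use primes of 1024+ bits.
p, q = 61, 53
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent

# Values precomputed once per key for CRT decryption.
dp = d % (p - 1)
dq = d % (q - 1)
q_inv = pow(q, -1, p)


def decrypt_crt(c: int) -> int:
    """RSA private-key operation using the Chinese Remainder Theorem."""
    m1 = pow(c, dp, p)             # half-size exponentiation modulo p
    m2 = pow(c, dq, q)             # half-size exponentiation modulo q
    h = (q_inv * (m1 - m2)) % p    # Garner's recombination step
    return m2 + h * q


message = 65
ciphertext = pow(message, e, n)
recovered = decrypt_crt(ciphertext)
```

The two exponentiations are independent of each other, which is what makes the CRT form attractive for parallel hardware.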


International Conference on Applications and Techniques in Information Security | 2015

RICS-DFA: Reduced Input Character Set DFA for Memory-Efficient Regular Expression Matching

Qiu Tang; Lei Jiang; Qiong Dai; Majing Su; Hongtao Xie

Regular expression matching, a core component of deep packet inspection (DPI), is widely used in modern network intrusion detection systems (NIDS), traffic classification systems, network monitoring systems, and so on. In these systems, regular expressions are typically converted to a deterministic finite automaton (DFA), which scans each byte of an incoming packet’s payload against the regular expression rule sets to judge whether the packet matches any rule. A match means the packet contains a specific attack, virus, or similar threat. However, the DFA generally consumes a large amount of memory, and much recent work focuses on reducing this memory requirement. In the same vein, this paper proposes a compact, time-efficient, and novel DFA structure that significantly decreases the DFA’s space, called Reduced Input Character Set DFA (RICS-DFA). A character escaping and replacing scheme is first introduced to decrease the size of the DFA’s character set, and a series of optimization techniques based on RICS-DFA then reduces the DFA’s space requirement. A RICS-DFA is constructed by transition rewriting. Experimental results on real-life rule sets reveal that, compared to the original DFA, the RICS-DFA reduces memory consumption by 68%–92% while sacrificing only a trivial amount of matching speed.


International Conference on Applications and Techniques in Information Security | 2014

Sybil-Resist: A New Protocol for Sybil Attack Defense in Social Network

Wei Ma; Sen-Zhe Hu; Qiong Dai; Ting-Ting Wang; Yin-Fei Huang

Currently, most existing social networks on the Internet are distributed, decentralized systems, and they are particularly vulnerable to Sybil attacks, in which a single malicious user introduces multiple bogus identities and pretends to be multiple real users in the network. With these controlled identities, the malicious user can create a Byzantine failure in collaborative tasks by outvoting the real identities. This paper surveys the network security of social networks to give an overview of their current online security and the corresponding defense methods. Based on the survey, this paper proposes Sybil-Resist, a random-walk-based Sybil attack defense protocol devoted to identifying the Sybil nodes and the Sybil region efficiently. Simulation results obtained on a more realistic simulation topology show that the proposed scheme outperforms existing solutions in terms of detection accuracy and running time.
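The intuition behind random-walk-based Sybil defenses can be sketched generically: short random walks that start from a trusted node rarely cross the few attack edges into the Sybil region. The example below is not the Sybil-Resist protocol, whose steps the abstract does not describe. It assumes a simple trust propagation that computes the walk distribution after a fixed number of steps and ranks nodes by degree-normalized trust. The graph, the seed, and the function name `propagate_trust` are hypothetical.

```python
from typing import Dict, List

Graph = Dict[int, List[int]]  # undirected graph as adjacency lists


def propagate_trust(graph: Graph, seed: int, steps: int) -> Dict[int, float]:
    """Distribution of a random walk from `seed` after `steps` steps,
    divided by node degree so that well-connected nodes are not favoured."""
    trust = {v: 0.0 for v in graph}
    trust[seed] = 1.0
    for _ in range(steps):
        nxt = {v: 0.0 for v in graph}
        for u, neighbours in graph.items():
            share = trust[u] / len(neighbours)  # split trust evenly
            for v in neighbours:
                nxt[v] += share
        trust = nxt
    return {v: trust[v] / len(graph[v]) for v in graph}


# Honest region: nodes 0-3, fully connected. Sybil region: nodes 4-6.
# The single attack edge is 3-4.
graph: Graph = {
    0: [1, 2, 3],
    1: [0, 2, 3],
    2: [0, 1, 3],
    3: [0, 1, 2, 4],
    4: [3, 5, 6],
    5: [4, 6],
    6: [4, 5],
}
score = propagate_trust(graph, seed=0, steps=2)
suspects = sorted(graph, key=lambda v: score[v])[:3]  # lowest-ranked nodes
```

Stopping the walk early matters: given enough steps the distribution converges and the distinction between regions disappears.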


International Conference on Trustworthy Computing and Services | 2012

A Dynamic Strategy to Cache Out-of-Sequence Packet in DPI System

Qingyun Liu; Wenzhong Feng; Qiong Dai

As a major approach for a network security system to discover threats or perform forensics, the DPI (Deep Packet Inspection) technique is widely used in monitoring network flows. With the rapid growth of Internet bandwidth, DPI systems face more and more performance challenges. One of these challenges is that out-of-sequence packets in TCP transmission greatly affect memory consumption and data recall. In a large-scale DPI system, each DPI node has to monitor a huge number of TCP sessions. Allocating enough space to store all out-of-sequence packets would consume too many resources. Meanwhile, insufficient buffer space results in dropped packets and thus in network flows that cannot be reassembled. We analyze the out-of-sequence characteristics of different Internet flows and implement a dynamic strategy to cache out-of-sequence packets, which provides a more flexible way to keep track of the sessions. Experiments show that with the new strategy, a DPI system can greatly improve the completeness of data recall with little extra space consumption.
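The trade-off between buffer space and recall can be made concrete with a minimal per-session reassembly buffer. This sketch assumes a fixed byte budget per session and drops out-of-sequence segments that exceed it. It does not reproduce the paper's dynamic strategy, and it ignores sequence-number wraparound and overlapping segments. The class name `Reassembler` and its fields are hypothetical.

```python
from typing import Dict


class Reassembler:
    """Caches out-of-sequence TCP segments for one session within a budget."""

    def __init__(self, initial_seq: int, max_buffered: int) -> None:
        self.expected = initial_seq        # next in-order sequence number
        self.max_buffered = max_buffered   # byte budget for cached segments
        self.pending: Dict[int, bytes] = {}
        self.buffered = 0
        self.dropped = 0

    def receive(self, seq: int, payload: bytes) -> bytes:
        """Return the bytes that became deliverable in order, if any."""
        if seq < self.expected or seq in self.pending:
            return b""                     # retransmission, already handled
        if seq > self.expected:
            if self.buffered + len(payload) > self.max_buffered:
                self.dropped += 1          # no room: the flow loses this data
                return b""
            self.pending[seq] = payload
            self.buffered += len(payload)
            return b""
        # In-order segment: deliver it, then drain any cached successors.
        out = [payload]
        self.expected += len(payload)
        while self.expected in self.pending:
            chunk = self.pending.pop(self.expected)
            self.buffered -= len(chunk)
            out.append(chunk)
            self.expected += len(chunk)
        return b"".join(out)


session = Reassembler(initial_seq=0, max_buffered=8)
session.receive(3, b"def")               # cached, nothing deliverable yet
data = session.receive(0, b"abc")        # fills the gap and drains the cache
```

A dynamic strategy in the spirit of the abstract would adjust `max_buffered` per flow rather than fixing it, but how it does so is not specified in the text.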


Archive | 2018

High Performance Regular Expression Matching on FPGA

Jiajia Yang; Lei Jiang; Xu Bai; Qiong Dai

Deep Packet Inspection (DPI) technology has been widely deployed in Network Intrusion Detection Systems (NIDS) to detect attacks and viruses. State-of-the-art NIDS use Deterministic Finite Automata (DFA) to perform regular expression matching because of their stable matching speed. However, the throughput of the traditional DFA algorithm is limited by the width of its input (usually one character at a time). In this paper, we present an architecture named Parallel-DFA that accelerates regular expression matching by scanning multiple characters at a time. Experimental results show that our architecture can achieve a rate as high as 1200 Gbps (1.17 Tbps) on a single current Field-Programmable Gate Array (FPGA) chip. This makes it a very practical solution for NIDS in 100G Ethernet networks, currently the fastest approved Ethernet standard. To the best of our knowledge, this is the fastest matching architecture on a single FPGA chip. In addition, the throughput is nearly 3 orders of magnitude (916×) higher than that of the original DFA implemented in software, and our architecture is about 183.2× more efficient than the original DFA.
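Scanning several characters per step can be illustrated with a 2-stride transition table, built by composing the ordinary table with itself. This is a generic multi-stride construction, not the Parallel-DFA architecture, and it omits the handling of accepting states that are passed in the middle of a stride. The function names are hypothetical.

```python
from typing import List


def two_stride(table: List[List[int]]) -> List[List[int]]:
    """Build a table that consumes two symbols per lookup.

    Entry [s][a * n_sym + b] is the state reached from s on symbol a
    followed by symbol b.
    """
    n_sym = len(table[0])
    return [
        [table[table[s][a]][b] for a in range(n_sym) for b in range(n_sym)]
        for s in range(len(table))
    ]


def run(table: List[List[int]], state: int, symbols: List[int]) -> int:
    """Reference matcher: one symbol per step."""
    for c in symbols:
        state = table[state][c]
    return state


def run2(table2: List[List[int]], n_sym: int, state: int,
         symbols: List[int]) -> int:
    """Two symbols per step; assumes an even number of symbols."""
    for i in range(0, len(symbols), 2):
        state = table2[state][symbols[i] * n_sym + symbols[i + 1]]
    return state


table = [
    [1, 0],
    [1, 2],
    [1, 0],
]
table2 = two_stride(table)
final = run2(table2, 2, 0, [0, 1, 0, 1])  # two lookups instead of four
```

The number of columns grows as the square of the alphabet size, which is why multi-stride designs usually need some form of table compression alongside them.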

Collaboration


Qiong Dai's top co-authors and their affiliations.

Top Co-Authors

Lei Jiang | Chinese Academy of Sciences

Qiu Tang | Chinese Academy of Sciences

Hongtao Xie | Chinese Academy of Sciences

Jiajia Yang | Chinese Academy of Sciences

Majing Su | Chinese Academy of Sciences

Jianlong Tan | Chinese Academy of Sciences

Xu Bai | Chinese Academy of Sciences

Binxing Fang | Harbin Institute of Technology

Wei Ma | Chinese Academy of Sciences

Yanping Ma | Ocean University of China