Thilan Ganegedara | Researchain

Publication

Featured research published by Thilan Ganegedara.

high performance switching and routing | 2012

StrideBV: Single chip 400G+ packet classification

Thilan Ganegedara; Viktor K. Prasanna

Hardware firewalls act as the first line of defense in protecting networks against attacks. Packets are organized into flows based on a set of packet header fields, and a predefined rule is applied to the packets in each flow to filter malicious network traffic. This is realized using packet classification, which is implemented in secure networking environments where mere best-effort delivery of packets is not adequate. Existing packet classification solutions are highly dependent on the properties (or features) of the ruleset. We present a bit-vector-based lookup scheme and a parallel hardware architecture that do not rely on ruleset features. A detailed performance analysis of the proposed scheme is given under different configurations. Post place-and-route results of our parallel pipelined architecture on a state-of-the-art Field Programmable Gate Array (FPGA) device show that for real-life firewall rulesets, the proposed solution achieves 400G+ throughput. To the best of our knowledge, this is the first packet classification engine that achieves a 400G+ rate on a single FPGA. Further, on average we achieve 2.5× power efficiency compared with state-of-the-art solutions.

international performance, computing, and communications conference | 2010

FRuG: A benchmark for packet forwarding in future networks

Thilan Ganegedara; Weirong Jiang; Viktor K. Prasanna

The ossification of Internet infrastructure and protocols has hindered the Internet's own advancement. GENI, AKARI and several other similar initiatives are pushing forward to overcome this hindrance. They provide researchers with networking platforms dedicated to innovative networking experiments. In these virtualized platforms, researchers can define their own forwarding schemes, which can be radically different from the existing solutions. However, neither researchers nor vendors have benchmarks to evaluate their new schemes. In this paper we introduce a Flexible Rule Generator, FRuG, an entirely user-controlled benchmarking tool for evaluating future packet forwarding algorithms. With FRuG, rule generation is no longer restricted to a fixed number of fields, which makes it highly generic. It allows the user to select the protocol fields and the distribution of each field, which can either be defined by the user or configured to follow the distribution of an input seed file. The user has complete control over the structure and the size of the rule table, which makes it a powerful benchmark for assessing various packet forwarding algorithms on different types of routers (e.g., edge routers, core routers, etc.). FRuG consists of an IPv4 prefix analyzer and generator, a MAC address analyzer and generator, and a generic rule generator. We believe that FRuG will be a very useful tool for the networking research community given paradigm shifts in networking such as network virtualization, which takes packet forwarding to a completely new level. FRuG is an open-source tool freely available at http://sites.google.com/site/thilangane/research.

field programmable gate arrays | 2011

Memory-efficient and scalable virtual routers using FPGA

Hoang Le; Thilan Ganegedara; Viktor K. Prasanna

Router virtualization has recently gained much interest in the research community. It allows multiple virtual router instances to run on a common physical router platform. The key metrics in designing network virtual routers are: (1) the number of supported virtual router instances, (2) the total number of prefixes, and (3) the ability to quickly update the virtual table. Limited on-chip memory in FPGAs leads to the need for memory-efficient merging algorithms. On the other hand, due to the high frequency of combined updates from all the virtual routers, the merging algorithms must be highly efficient. Hence, the router must support quick updates. In this paper, we propose a simple merging algorithm whose performance is not sensitive to the number of routing tables considered. The performance depends solely on the total number of prefixes. We also propose a novel scalable, high-throughput linear pipeline architecture for IP lookup that supports large virtual routing tables and quick non-blocking updates. Using a state-of-the-art Field Programmable Gate Array (FPGA) along with external SRAM, the proposed architecture can support up to 16M IPv4 and 880K IPv6 prefixes. Our implementation shows a sustained throughput of 400 million lookups per second, even when external SRAM is used.

IEEE Transactions on Parallel and Distributed Systems | 2014

A Scalable and Modular Architecture for High-Performance Packet Classification

Thilan Ganegedara; Weirong Jiang; Viktor K. Prasanna

Packet classification is widely used as a core function for various applications in network infrastructure. With increasing demands in throughput, performing wire-speed packet classification has become challenging. Also, the performance of today's packet classification solutions depends on the characteristics of rulesets. In this work, we propose a novel modular Bit-Vector (BV) based architecture to perform high-speed packet classification on Field Programmable Gate Array (FPGA). We introduce an algorithm named StrideBV and modularize the BV architecture to achieve better scalability than traditional BV methods. Further, we incorporate range search in our architecture to eliminate ruleset expansion caused by range-to-prefix conversion. The post place-and-route results of our implementation on a state-of-the-art FPGA show that the proposed architecture is able to operate at 100+ Gbps for minimum-size packets while supporting large rulesets of up to 28K rules using only the on-chip memory resources. Our solution is ruleset-feature independent, i.e., the above performance can be guaranteed for any ruleset regardless of the composition of the ruleset.
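
The bit-vector idea can be illustrated in software. The following is a minimal sketch of a generic stride-based bit-vector lookup, assuming rules are given as ternary strings over '0', '1' and '*' and that a lower rule index means higher priority. It is not the authors' hardware design, it omits the range-search extension, and all function names are hypothetical.

```python
def build_stride_tables(rules, stride):
    """Precompute one table per stride of header bits.

    rules: equal-length ternary strings; index 0 is the highest priority.
    Each table maps a stride value to a bit vector (a Python int) in which
    bit i is set when rule i matches that value on that stride.
    """
    width = len(rules[0])
    assert width % stride == 0, "header width must be a multiple of the stride"
    tables = []
    for start in range(0, width, stride):
        table = []
        for value in range(1 << stride):
            bits = format(value, f"0{stride}b")
            vector = 0
            for rule_id, rule in enumerate(rules):
                chunk = rule[start:start + stride]
                if all(r in ("*", b) for r, b in zip(chunk, bits)):
                    vector |= 1 << rule_id
            table.append(vector)
        tables.append(table)
    return tables


def classify(tables, header, stride):
    """AND the per-stride bit vectors; return the best rule index or None."""
    match = -1  # all ones in two's complement
    for i, table in enumerate(tables):
        value = int(header[i * stride:(i + 1) * stride], 2)
        match &= table[value]
    if match == 0:
        return None
    # The lowest set bit corresponds to the highest-priority matching rule.
    return (match & -match).bit_length() - 1


rules = ["10**", "1***", "****"]
tables = build_stride_tables(rules, stride=2)
print(classify(tables, "1011", 2))  # 0
print(classify(tables, "1100", 2))  # 1
print(classify(tables, "0000", 2))  # 2
```

Because every stride value has a precomputed vector for all rules, the work per lookup depends only on the header width, the stride and the number of rules, not on how the rules are composed. That is the sense in which such a scheme is independent of ruleset features.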

ieee international symposium on parallel & distributed processing, workshops and phd forum | 2013

A Comparison of Ruleset Feature Independent Packet Classification Engines on FPGA

Andrea Sanny; Thilan Ganegedara; Viktor K. Prasanna

Packet classification is used in network firewalls to identify and filter threats or unauthorized network access at the application level. This is realized by comparing incoming packet headers against a predefined rule set. Many solutions to packet classification are available, but most of these solutions exploit some features of the rule set in order to minimize the memory footprint of rule set storage. However, when the expected rule set features are not present, feature-reliant solutions may yield poor memory efficiency. In this paper, we focus on two rule-set-independent packet classification schemes, Ternary Content Addressable Memory (TCAM), a brute-force search method, and StrideBV, a bit-vector-based algorithmic solution, to determine which solution is better suited for high-performance packet classification. Using rule set sizes ranging from 32 to 2048 (targeted for firewall rule sets), we implement both schemes on a Field-Programmable Gate Array (FPGA) to evaluate their performance. We measure the performance using memory efficiency, resource consumption, throughput and power efficiency metrics for both solutions. The post place-and-route results on a state-of-the-art FPGA reveal that StrideBV has 4.5× and 3.5× higher power efficiency in comparison with TCAM, along with 6× and 4× higher throughput, when using distributed RAM and block RAM as memory, respectively. TCAM has better memory efficiency, though its improvement over StrideBV varies.

international conference on communications | 2011

Multiroot: Towards Memory-Efficient Router Virtualization

Thilan Ganegedara; Weirong Jiang; Viktor K. Prasanna

Network virtualization has become a powerful scheme to make efficient use of networking hardware. It allows multiple virtual networks to co-exist on the same physical networking substrate. This requires the hardware router to maintain multiple lookup tables. Hence, ultimately the hardware router should be capable of handling packets from different virtual networks. In this paper, we introduce a memory-efficient solution for router virtualization named Multiroot. We propose this scheme for Provider Edge (PE) router virtualization after examining the address space requirement of such networks. Multiroot is a novel merging technique that consolidates all the routing tables into a single merged table. The shared data structure used in our algorithm results in a significant memory usage reduction in the lookup data structure while guaranteeing traffic isolation, which is critical in a virtualized environment. This improvement in memory usage results in a very scalable solution for router virtualization in terms of resource usage of the hardware router. Multiroot uses a trie data structure and can be implemented on a hardware or a software platform. Experiments show that our solution can achieve up to a 5-fold memory usage reduction compared to state-of-the-art techniques in the literature.

field-programmable logic and applications | 2011

Towards On-the-Fly Incremental Updates for Virtualized Routers on FPGA

Thilan Ganegedara; Hoang Le; Viktor K. Prasanna

Recently, router virtualization has gained much interest in the networking community. However, hardware support for router virtualization is still in its primitive stages. One of the major problems in a virtualized router is how to support frequent routing table updates efficiently, without interrupting network traffic. In this paper, we propose a Field Programmable Gate Array (FPGA) based architecture for router virtualization that supports on-the-fly updates, while ensuring scalability and throughput requirements. We introduce a distance-based mapping technique named Fill-In to merge multiple virtual routing tables into a single search tree. Node sharing is avoided by using a uniform data structure that results in a scalable solution for router virtualization. The reconfigurability and abundant parallelism of FPGAs make them a desirable hardware platform for high-performance and cost-effective routers. We leverage the features of modern FPGA devices to implement a parallel-linear-pipelined packet processing engine. Our post place-and-route results show that the proposed architecture can support uninterrupted network traffic at 150 Gbps for minimum-size (40-byte) packets. The scalability of the architecture is demonstrated for up to 17 real routing tables. Using the proposed update techniques, our architecture handles an update with a single write bubble.
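
As a quick sanity check on the 150 Gbps figure, a line rate can be converted into a packet rate. The sketch below assumes that throughput is counted over packet bytes only, with no Ethernet preamble or inter-frame gap (the abstract does not say either way); the function name is hypothetical.

```python
def packets_per_second(gbps, packet_bytes):
    """Convert a line rate in Gbps into packets per second.

    Assumption: only the packet bytes count toward throughput
    (no preamble or inter-frame gap).
    """
    bits_per_packet = packet_bytes * 8
    return gbps * 10**9 / bits_per_packet


rate = packets_per_second(150, 40)
print(f"{rate / 1e6:.2f} million packets per second")  # 468.75
```

Under that assumption, 150 Gbps of 40-byte packets corresponds to roughly 469 million lookups per second.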

global communications conference | 2013

100+ Gbps IPv6 packet forwarding on multi-core platforms

Thilan Ganegedara; Viktor K. Prasanna

The migration from IPv4 to IPv6 addressing is gradually taking place with the exhaustion of IPv4 address space. This requires the network infrastructure to have the capability to process and route IPv6 packets. However, with the increased complexity of the lookup operation and storage requirements, performing IPv6 lookup at wire speed is challenging. In this work, we propose a high-performance IPv6 lookup engine for multi-core platforms that delivers state-of-the-art line card throughput rates. In order to exploit the parallelism offered on modern multi-core platforms, we propose a routing table partitioning scheme that forms disjoint and balanced partitions, given an IPv6 routing table. These partitions are represented as range trees to perform the lookup operation. Due to the disjoint nature of the proposed partitioning scheme, the individual range trees are able to operate independently, improving the parallelism of the lookup engine. Our experimental results on state-of-the-art multi-core processors show that throughputs of 100+ Gbps can be achieved for 2-million-entry IPv6 routing tables using the proposed scheme. Compared with existing literature, the proposed solution achieves 10× higher throughput and is on par with the performance delivered by hardware IP lookup engines.

reconfigurable architectures workshop | 2013

A comprehensive performance analysis of virtual routers on FPGA

Thilan Ganegedara; Viktor K. Prasanna

Network virtualization has gained much popularity with the advent of datacenter networking. The hardware aspect of network virtualization, router virtualization, allows network service providers to consolidate network hardware, reducing equipment cost and management overhead. Several approaches have been proposed to achieve router virtualization to support several virtual networks on a single hardware platform. However, their performance has not been analyzed quantitatively to understand the benefits of each approach. In this work, we perform a comprehensive analysis of the performance of these approaches on Field Programmable Gate Array (FPGA) with respect to memory consumption, throughput, and power consumption. Generalized versions of virtualization approaches are evaluated based on post place-and-route results on a state-of-the-art FPGA. Grouping of routing tables is proposed as a novel approach to improve the scalability (i.e., the number of virtual networks hosted on a single chip) of virtual routers on FPGA with respect to memory requirement. Further, we employ floor-planning techniques to efficiently utilize chip resources and achieve high performance for virtualized, pipelined architectures, resulting in a 1.6× speedup on average compared with the non-floor-planned approach. The results indicate that the proposed solution is able to support 100+ and 50 virtual routers per chip in the near-best and near-worst case scenarios, respectively, while operating at 20+ Gbps rates.

field-programmable logic and applications | 2013

A high-performance IPv6 lookup engine on FPGA

Thilan Ganegedara; Viktor K. Prasanna

We present a routing table partitioning based solution for a high-performance IPv6 packet lookup engine on Field Programmable Gate Arrays (FPGAs). Based on the statistics collected from real-life backbone IPv6 routing tables, we propose a partitioning algorithm that creates both disjoint and balanced prefix groups. For each partition, a range tree is built to perform IPv6 lookup. These range trees are mapped onto independent pipelines on FPGA such that for a single IPv6 lookup, only one partition is active. This yields high dynamic power efficiency via selective stage memory enabling. The balanced partitioning enables us to exploit the memory layout of the FPGA to align the pipeline with the on-chip memory blocks for enhanced performance and resource usage. Post place-and-route results on a state-of-the-art FPGA platform show that a throughput of 200+ Gbps can be achieved for a 1-million-entry IPv6 routing table.
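
To make the range-based lookup concrete, here is a minimal software sketch of longest-prefix matching over address ranges. It assumes a single partition and uses a sorted list with binary search as a stand-in for a range tree; the partitioning algorithm and the pipeline mapping described above are not modeled, and the function names and sample prefixes are hypothetical.

```python
import bisect
import ipaddress


def build_range_table(prefixes):
    """Turn (cidr, next_hop) pairs into elementary address intervals.

    Returns (starts, hops): starts[i] is the first address of interval i and
    hops[i] is the next hop of the longest prefix covering that interval
    (None when no prefix covers it).
    """
    nets = [(ipaddress.ip_network(cidr), hop) for cidr, hop in prefixes]
    points = {0}
    for net, _ in nets:
        low = int(net.network_address)
        high = int(net.broadcast_address)  # last address in the prefix
        points.add(low)
        if high + 1 < 2**128:
            points.add(high + 1)
    starts = sorted(points)
    hops = []
    for start in starts:
        best_hop, best_len = None, -1
        # Quadratic build: acceptable for a sketch, not for large tables.
        for net, hop in nets:
            low = int(net.network_address)
            high = int(net.broadcast_address)
            if low <= start <= high and net.prefixlen > best_len:
                best_hop, best_len = hop, net.prefixlen
        hops.append(best_hop)
    return starts, hops


def lookup(starts, hops, address):
    """Binary-search the interval containing the address."""
    key = int(ipaddress.ip_address(address))
    index = bisect.bisect_right(starts, key) - 1
    return hops[index]


table = build_range_table([("2001:db8::/32", "A"), ("2001:db8:1::/48", "B")])
print(lookup(*table, "2001:db8:1::5"))  # B
print(lookup(*table, "2001:db8:2::1"))  # A
print(lookup(*table, "2001:db9::1"))    # None
```

Every prefix boundary becomes an interval boundary, so the set of covering prefixes is constant inside each interval and a single search per lookup suffices.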
