Frank T. Hady
Intel
Publications
Featured research published by Frank T. Hady.
IEEE Network | 2003
Frank T. Hady; Tony Bock; Mason B. Cabot; Jim Chu; Jeff Meinecke; Ken Oliver; Wes Talarek
Even in the face of increasing network bandwidth, there is a desire among service providers to improve network security, availability, and performance. These improvements require increasingly complex computations on network packets. Current networking platforms cannot keep up, leading to less than desired throughput or functionality. Network processors deliver high networking throughput, but not the complex processing capabilities required. High-performance general-purpose processors deliver the complex processing needed, but not the network throughput. Combination platforms that include high-performance general-purpose CPUs and network processors hold the promise of greatly increasing platform performance, enabling desired edge application improvements. This article presents Twin Cities, a heterogeneous multiprocessor research platform we have constructed from a standard IXP1240 platform, a high-volume Intel® Pentium® III processor platform, and custom hardware. This platform provides a high-performance path (high throughput, low latency) between the two processors and presents a shared memory model to the programmer. We motivate and describe the Twin Cities platform, discuss the applications it targets, and present performance measurements.
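The abstract describes the shared-memory path only at a high level. One common way such a path between a network processor and a general-purpose CPU is realized is a single-producer/single-consumer descriptor ring placed in memory visible to both sides. The C sketch below is a minimal, hypothetical illustration of that idea (the pkt_desc and spsc_ring names, sizes, and fields are invented for this sketch), not the actual Twin Cities interface.

/* Minimal single-producer/single-consumer descriptor ring, a common way to
 * build a shared-memory packet path between two processors.
 * Hypothetical sketch; not the actual Twin Cities interface. */
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define RING_SLOTS 256                  /* power of two so index masking works */

struct pkt_desc {                       /* what the network processor hands over */
    uint32_t len;
    uint8_t  data[64];                  /* e.g., a truncated packet header */
};

struct spsc_ring {
    _Atomic uint32_t head;              /* written only by the producer (NP side) */
    _Atomic uint32_t tail;              /* written only by the consumer (host CPU) */
    struct pkt_desc slot[RING_SLOTS];
};

/* Producer side: returns 0 on success, -1 if the ring is full. */
static int ring_push(struct spsc_ring *r, const struct pkt_desc *d)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_SLOTS)
        return -1;                      /* full */
    r->slot[head & (RING_SLOTS - 1)] = *d;
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return 0;
}

/* Consumer side: returns 0 on success, -1 if the ring is empty. */
static int ring_pop(struct spsc_ring *r, struct pkt_desc *d)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail)
        return -1;                      /* empty */
    *d = r->slot[tail & (RING_SLOTS - 1)];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return 0;
}

int main(void)
{
    static struct spsc_ring ring;       /* would live in memory visible to both CPUs */
    struct pkt_desc in = { .len = 5 }, out;
    memcpy(in.data, "hello", 5);

    if (ring_push(&ring, &in) == 0 && ring_pop(&ring, &out) == 0)
        printf("dequeued %u-byte descriptor\n", (unsigned)out.len);
    return 0;
}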
architectures for networking and communications systems | 2005
Kristen Accardi; Tony Bock; Frank T. Hady; Jon Krueger
Network firewalls occupy a central role in computer security, protecting data, compute, and networking resources while still allowing useful packets to flow. Increases in both the work per network packet and the packet rate make it increasingly difficult for firewalls based on general-purpose processors to maintain line rate. To address these evolving requirements, we have prototyped a hybrid firewall, using a simple firewall running on a network processor to accelerate a Linux* Netfilter firewall executing on a general-purpose processor. The simple firewall on the network processor provides high-rate packet processing for all packets, while the general-purpose processor delivers high-rate, full-featured firewall processing for those packets that need it. This paper describes the hybrid firewall prototype with a focus on the software created to accelerate Netfilter with a network-processor-resident firewall. Measurements show that our hybrid firewall maintains close to 2 Gb/sec line rate for all packet sizes, a significant improvement over the original firewall. We also include the hard-won lessons learned while implementing the hybrid firewall.
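The fast-path/slow-path split described above is often organized around a per-flow verdict cache: the network processor answers packets whose 5-tuple already has a cached verdict and punts everything else to the full-featured firewall. The C sketch below is a hypothetical illustration of such a cache (the flow_key layout, cache size, and verdict names are invented), not the prototype's actual code.

/* Hypothetical per-flow verdict cache illustrating a fast-path/slow-path split.
 * Cached flows are answered immediately; misses would be punted to the
 * full-featured slow-path firewall (Netfilter in the prototype above). */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

enum verdict { VERDICT_UNKNOWN = 0, VERDICT_ACCEPT, VERDICT_DROP };

struct flow_key {                        /* classic 5-tuple */
    uint32_t saddr, daddr;
    uint16_t sport, dport;
    uint8_t  proto;
};

#define CACHE_SLOTS 1024                 /* power of two so masking works */

static struct {
    struct flow_key key;
    enum verdict    verdict;
} cache[CACHE_SLOTS];

static uint32_t flow_hash(const struct flow_key *k)
{
    uint32_t h = k->saddr;
    h = h * 31u + k->daddr;
    h = h * 31u + (((uint32_t)k->sport << 16) | k->dport);
    h = h * 31u + k->proto;
    return h;
}

/* Fast path: return the cached verdict, or VERDICT_UNKNOWN on a miss. */
static enum verdict fast_path_lookup(const struct flow_key *k)
{
    size_t i = flow_hash(k) & (CACHE_SLOTS - 1);
    if (cache[i].verdict != VERDICT_UNKNOWN &&
        memcmp(&cache[i].key, k, sizeof(*k)) == 0)
        return cache[i].verdict;
    return VERDICT_UNKNOWN;
}

/* The slow path installs the verdict it computed so later packets hit the cache. */
static void install_verdict(const struct flow_key *k, enum verdict v)
{
    size_t i = flow_hash(k) & (CACHE_SLOTS - 1);
    cache[i].key = *k;
    cache[i].verdict = v;
}

int main(void)
{
    struct flow_key k;
    memset(&k, 0, sizeof(k));            /* zero padding so memcmp is reliable */
    k.saddr = 0x0a000001; k.daddr = 0x0a000002;
    k.sport = 12345; k.dport = 80; k.proto = 6;

    if (fast_path_lookup(&k) == VERDICT_UNKNOWN)
        install_verdict(&k, VERDICT_ACCEPT);   /* stand-in for a slow-path decision */

    printf("cached verdict: %s\n",
           fast_path_lookup(&k) == VERDICT_ACCEPT ? "accept" : "miss/drop");
    return 0;
}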
IEEE Transactions on Computers | 1995
Frank T. Hady; Bernard Menezes
Wormhole routing is an attractive routing technique offering low latency communication without the need to buffer an entire packet in a single node. A new queueing-theoretic model for obtaining throughput and latency of binary hypercubes supporting wormhole routing is developed here. The model is very accurate in predicting the performance of an actual multicomputer over a range of network sizes, packet lengths, and input port priority mappings. Utilizing the model, the performance of networks with identical topologies but different node architectures is estimated.
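The paper's queueing model is not reproduced in the abstract. For orientation only, the standard zero-load (no-contention) baseline on which wormhole-routing latency models build is

\[
T_0 \approx H\,t_r + \left\lceil \frac{L}{W} \right\rceil t_c ,
\qquad
\bar{H} = \frac{n}{2} \quad \text{(n-cube, uniform random traffic)},
\]

where H is the hop count, t_r the per-hop routing delay, L the packet length in bits, W the channel width, and t_c the channel cycle time. This is the textbook expression, not the model developed in the paper, which goes further by accounting for queueing and blocking under load.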
Proceedings of the IEEE | 2017
Frank T. Hady; Annie P. Foong; Bryan E. Veal; Dan J. Williams
With a combination of high performance and nonvolatility, the arrival of 3D XPoint memory promises to fundamentally change the memory-storage hierarchy at the hardware, system software, and application levels. This memory will be deployed first as a block addressable storage device, known as the Intel Optane SSD, and even in this familiar form it will drive basic system change. Access times consistently as fast, or faster, than the rest of the system will blur the line between storage and memory. The low latencies from these solid-state drives (SSDs) allow rethinking even basic storage methodologies to be more memory-like. For example, the manner in which storage performance is measured shifts from input–output operations (IOs) at a given queue depth to response time for a given load, like memory is typically measured. System changes to match the low latency of these SSDs are already advanced, and in many cases they enable the application to utilize the SSD’s performance. In other cases, additional work is required, particularly on policies set originally with slow storage in mind. On top of these already-capable systems are real applications. System-level tests show that applications such as key–value stores and real-time analytics can benefit immediately. These application benefits include significantly faster runtime (up to 3×) and access to larger data sets than supported in DRAM. Newly viable mechanisms for expanding application memory footprint include native application support or native operating system paging, a significant change in the use of SSDs. The next step in this convergence is 3D XPoint memory accessed through processor load/store operations. Significant operating system support is already in place. The implications of consistently low latency storage and fast persistent memory on computing are great, with applications and systems that take advantage of this new technology as storage being the first to benefit.
international memory workshop | 2016
Annie Foong; Frank T. Hady
Ultra-low latency, high endurance SSDs are poised to enter the market, based on 3D XPoint™ memory. Here we show that for these new SSDs and modern platforms, storage latency is equally divided between the SSD and the rest of the platform. We summarize some of the recent system-level optimizations that make this possible. Such low latency storage offers a potential for applications to use storage as a resource in place of memory. We describe a few examples of use case analyses that we have undertaken. Finally, we comment on the use of 3D XPoint memory accessed as system memory rather than as storage.
Network Processor Design | 2003
Prashant R. Chandra; Frank T. Hady; Raj Yavatkar; Tony Bock; Mason B. Cabot; Philip P. Mathew
Network processors (NPs) are an emerging class of programmable processors optimized to implement data plane and packet processing networking functions. NPs essentially support a distributed, parallel programming model, and they are optimized for fast packet processing and I/O. Standard benchmarks for network processors do not yet exist, making it difficult to measure, communicate, and compare NP performance; a standard set of benchmarks applicable to network processors is therefore required. The goals of such a set of benchmarks include covering different NP architectures, applicability to different NP application domains, and catering to audiences with different expectations. This chapter proposes a four-layered approach to NP benchmark definition to meet these goals and describes the utility of the hierarchical approach by specifying and measuring one or more benchmark examples at each level. The measurements are performed on the Intel IXP1200 network processor.
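The benchmark layers above are described only in the abstract. To give a concrete flavor of what the lowest (micro) level of such a hierarchy might contain, the C sketch below times a single packet-processing kernel, an RFC 1071 Internet checksum over synthetic IPv4 headers, and reports the cost per packet. It is a hypothetical micro-benchmark, not one of the chapter's actual benchmarks, and the packet count and header contents are made up.

/* Hypothetical micro-level benchmark: per-packet cost of an IPv4 header
 * checksum kernel.  Illustrative only; not one of the chapter's benchmarks. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Internet checksum (RFC 1071) over a 20-byte IPv4 header. */
static uint16_t ipv4_checksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)hdr[i] << 8 | hdr[i + 1];
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);   /* fold end-around carry */
    return (uint16_t)~sum;
}

int main(void)
{
    enum { NPKT = 1000000, HDR = 20 };
    static uint8_t hdrs[NPKT][HDR];
    for (int i = 0; i < NPKT; i++)            /* synthetic, varied headers */
        memset(hdrs[i], (uint8_t)i, HDR);

    struct timespec t0, t1;
    volatile uint32_t sink = 0;               /* keep the loop from being optimized away */

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < NPKT; i++)
        sink += ipv4_checksum(hdrs[i], HDR);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    printf("checksum kernel: %.1f ns/packet (sink=%u)\n", ns / NPKT, (unsigned)sink);
    return 0;
}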
file and storage technologies | 2012
Jisoo Yang; Dave B. Minturn; Frank T. Hady
Archive | 2005
Frank T. Hady; Mason B. Cabot; John Beck; Mark B. Rosenbluth
very large data bases | 2010
Annie P. Foong; Bryan E. Veal; Frank T. Hady
Archive | 2004
Mason B. Cabot; Frank T. Hady; Mark B. Rosenbluth