Publication


Featured research published by Matt Crawford.


IEEE Communications Letters | 2011

Why Can Some Advanced Ethernet NICs Cause Packet Reordering?

Wenji Wu; Phil DeMar; Matt Crawford

The Intel Ethernet Flow Director is an advanced network interface card (NIC) technology. It provides the benefits of parallel receive processing in multiprocessing environments and can automatically steer incoming network data to the same core on which its application process resides. However, our analysis and experiments show that Flow Director can cause packet reordering in multiprocessing environments. In this paper, we use a simplified model to analyze why Flow Director can cause packet reordering. Our experiments verify our analysis.
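
The reordering mechanism is easy to see in miniature. Below is a hedged sketch (an illustrative two-queue toy model, not the paper's analysis model or any NIC code): a flow's packets are steered to one RX queue until its application thread migrates, after which the flow table redirects the rest of the flow to a second queue; if the second queue happens to be serviced first, the flow reaches TCP out of order.

```python
# Toy model (assumption: two RX queues, one TCP flow, thread migrates once).
from collections import deque

queues = [deque(), deque()]   # two per-core NIC RX queues
steer_to = 0                  # flow-table entry: which queue gets the flow

for seq in range(10):
    if seq == 5:              # app thread migrates -> flow table updated
        steer_to = 1
    queues[steer_to].append(seq)

# Suppose queue 1's core services its interrupt before queue 0's core.
delivered = list(queues[1]) + list(queues[0])
print("delivered order:", delivered)   # [5..9, 0..4]
reordered = any(a > b for a, b in zip(delivered, delivered[1:]))
print("reordered:", reordered)         # True: the flow arrived out of order
```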


Computer Networks | 2009

Sorting Reordered Packets with Interrupt Coalescing

Wenji Wu; Phil DeMar; Matt Crawford

TCP performs poorly in networks with serious packet reordering. Processing reordered packets at the TCP layer is costly and inefficient, involving interaction between sender and receiver. Motivated by the interrupt coalescing mechanism, which delivers packets upward for protocol processing in blocks, we propose a new strategy, Sorting Reordered Packets with Interrupt Coalescing (SRPIC), to reduce packet reordering at the receiver. SRPIC works in the network device driver; it makes use of the interrupt coalescing mechanism to sort reordered packets belonging to the same TCP stream within a block of packets before delivering them upward, so that each sorted block is internally ordered. Experiments have proven the effectiveness of SRPIC against forward-path reordering.
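
The sorting step itself is simple. Here is a hedged sketch of the idea (the packet representation and names are illustrative assumptions, not the actual driver code): within one interrupt-coalesced block, packets of each TCP stream are sorted by sequence number before the block is passed upward.

```python
# Illustrative SRPIC-style sort over one interrupt-coalesced block.
# Simplification: flows are grouped by first appearance; the technique
# only requires ordering within each TCP stream.

def srpic_sort(block):
    # block: list of (flow_id, tcp_seq) pairs in NIC arrival order
    first_seen = {}
    for flow, _ in block:
        first_seen.setdefault(flow, len(first_seen))
    # order packets by (flow's first appearance, sequence number)
    return sorted(block, key=lambda p: (first_seen[p[0]], p[1]))

block = [("A", 3), ("A", 1), ("B", 7), ("A", 2), ("B", 6)]
print(srpic_sort(block))
# [('A', 1), ('A', 2), ('A', 3), ('B', 6), ('B', 7)] -- internally ordered
```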


IEEE Transactions on Parallel and Distributed Systems | 2012

A Transport-Friendly NIC for Multicore/Multiprocessor Systems

Wenji Wu; Phil DeMar; Matt Crawford

Receive side scaling (RSS) is an NIC technology that provides the benefits of parallel receive processing in multiprocessing environments. However, RSS lacks a critical data steering mechanism that would automatically steer incoming network data to the same core on which its application thread resides. This absence causes inefficient cache usage if an application thread is not running on the core on which RSS has scheduled the received traffic to be processed, and results in degraded performance. To remedy this RSS limitation, Intel's Ethernet Flow Director technology was introduced. However, our analysis shows that Flow Director can cause significant packet reordering, which has various negative impacts in high-speed networks. We propose an NIC data steering mechanism, mainly targeted at TCP, to remedy the RSS and Flow Director limitations. We term an NIC with such a data steering mechanism “A Transport-Friendly NIC” (A-TFN). Experimental results have proven the effectiveness of A-TFN in accelerating TCP/IP performance.
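
A minimal sketch of the contrast (a simplified model; the hash choice and flow-table API are illustrative assumptions, not the A-TFN design): RSS selects an RX queue from a hash of the connection 4-tuple alone, so it cannot follow a migrating application thread, whereas a data steering mechanism consults a flow-to-core table that tracks where the consuming thread actually runs.

```python
import zlib

NUM_CORES = 4

def rss_queue(four_tuple):
    # RSS-style: queue fixed by hashing the flow identity only
    return zlib.crc32(repr(four_tuple).encode()) % NUM_CORES

flow_table = {}  # flow -> core of the consuming application thread

def steered_queue(four_tuple, app_core):
    # steering-style: deliver to whichever core the app thread occupies
    flow_table[four_tuple] = app_core
    return flow_table[four_tuple]

flow = ("10.0.0.1", 5001, "10.0.0.2", 80)
print("RSS queue:    ", rss_queue(flow))         # fixed by the hash
print("steered queue:", steered_queue(flow, 2))  # follows the app thread
```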


Local Computer Networks | 2010

An evaluation of parallel optimization for OpenSolaris® network stack

Hongbo Zou; Wenji Wu; Xian-He Sun; Phil DeMar; Matt Crawford

Computing is now shifting towards multiprocessing. The fundamental goal of multiprocessing is improved performance through additional hardware threads or cores (referred to as “cores” for simplicity). Modern network stacks can exploit parallel cores to allow either message-based parallelism or connection-based parallelism as a means to enhance performance. OpenSolaris has been redesigned and parallelized to better utilize additional cores. Three technologies, named Soft Ring Set, Soft Ring, and Squeue, are introduced in OpenSolaris for stack parallelization. In this paper, we study the OpenSolaris packet receiving process and its core parallelism optimization techniques. Experimental results show that these techniques allow OpenSolaris to achieve better network I/O performance in multiprocessing environments; however, network stack parallelization also brings extra overhead to the system. An effective and efficient network I/O optimization in multiprocessing environments must cross all layers of the network stack, from network interface to application.
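
As a rough illustration of the connection-based parallelism that Squeue provides (a toy sketch under assumed simplifications, not OpenSolaris source): binding each connection to a single worker serializes that connection's protocol processing without per-packet locking, while distinct connections still proceed in parallel.

```python
import queue
import threading

NUM_WORKERS = 2
squeues = [queue.Queue() for _ in range(NUM_WORKERS)]

def worker(q):
    # Each worker drains one serialization queue; every packet of a given
    # connection is processed here, in arrival order, with no extra locks.
    while True:
        conn, pkt = q.get()
        if pkt is None:              # sentinel: shut the worker down
            break
        print(f"conn {conn}: processed packet {pkt}")

threads = [threading.Thread(target=worker, args=(q,)) for q in squeues]
for t in threads:
    t.start()

# Connection-based fan-out: a connection always maps to the same queue.
for conn, pkt in [(1, "a"), (2, "a"), (1, "b"), (2, "b")]:
    squeues[conn % NUM_WORKERS].put((conn, pkt))

for q in squeues:
    q.put((None, None))              # one sentinel per worker
for t in threads:
    t.join()
```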


Presented at 19th International Conference on Computing in High Energy and Nuclear Physics (CHEP 2012) | 2012

Scalability and Performance Improvements in the Fermilab Mass Storage System

Matt Crawford; Catalin Dumitrescu; Dmitry Litvintsev; Alexander Moibenko; Gene Oleynik

By 2009 the Fermilab Mass Storage System had encountered two major challenges: the amount of data stored and accessed in both tiers of the system (dCache and Enstore) had significantly increased, and the number of clients accessing the Mass Storage System had grown from tens to hundreds of nodes and from hundreds to thousands of parallel requests. To address these challenges, Enstore and the SRM part of dCache were modified to scale in performance, access rates, and capacity. This work increased the number of simultaneously processed requests in a single Enstore Library instance from about 1,000 to 30,000, and the rate of incoming requests to Enstore rose from tens to hundreds per second. Fermilab is invested in LTO4 tape technology, and we investigated both LTO5 and Oracle T10000C to cope with the increasing capacity needs. We decided to adopt T10000C, mainly due to its large capacity, which allows us to scale up the existing robotic storage space by a factor of 6. This paper describes the modifications and investigations that allowed us to meet these scalability and performance challenges and offers some perspectives on the Fermilab Mass Storage System.


Presented at 18th International Conference on Computing in High Energy and Nuclear Physics (CHEP 2010), Taipei, Taiwan, 18-22 Oct 2010; submitted to J. Phys. Conf. Ser. | 2011

Horizontally scaling dCache SRM with the Terracotta platform

T Perelmutov; Matt Crawford; Alexander Moibenko; Gene Oleynik

The dCache disk caching file system has been chosen by a majority of LHC experiments' Tier 1 centers for their data storage needs, and it is also deployed at many Tier 2 centers. The Storage Resource Manager (SRM) is a standardized grid storage interface and a single point of remote entry into dCache, and hence a critical component. SRM must scale to increasing transaction rates and remain resilient against changing usage patterns. The initial implementation of the SRM service in dCache could not support clustered deployment, and its performance was limited by the hardware of a single node. Using the Terracotta platform, we added the ability to horizontally scale the dCache SRM service to run on multiple nodes in a cluster configuration, coupled with network load balancing. This gives site administrators the ability to increase the performance and reliability of the SRM service to meet the ever-increasing requirements of LHC data handling. In this paper we describe the previous limitations of the SRM server architecture and how the Terracotta platform allowed us to readily convert the single-node service into a highly scalable clustered application.
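
The essence of the change can be sketched as follows (an illustrative toy of our own, not dCache or Terracotta code): request state moves out of a single JVM's heap into cluster-shared storage, so any SRM node behind the load balancer can accept a request or answer a status query.

```python
import itertools

shared_requests = {}                  # stands in for Terracotta-shared state
nodes = ["srm1", "srm2", "srm3"]
round_robin = itertools.cycle(nodes)  # stands in for the network load balancer

def submit(request_id, payload):
    # Any node may accept the request; its state lands in shared storage.
    node = next(round_robin)
    shared_requests[request_id] = {"payload": payload, "node": node}
    return node

def status(request_id):
    # Any node can answer, because the state is not confined to one heap.
    return shared_requests[request_id]

print(submit("req-1", "srmPrepareToGet file A"))   # hypothetical request
print(status("req-1"))
```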


Computer Communications | 2007

The performance analysis of Linux networking - Packet receiving

Wenji Wu; Matt Crawford; Mark Bowden


International Journal of Communication Systems | 2007

Potential performance bottleneck in Linux TCP

Wenji Wu; Matt Crawford


Computer Networks | 2007

Interactivity vs. fairness in networked Linux systems

Wenji Wu; Matt Crawford


Presented at International Conference on Computing in High Energy and Nuclear Physics (CHEP 07), Victoria, BC, Canada, 2-7 Sep 2007 | 2007

End-to-end network/application performance troubleshooting methodology

Wenji Wu; Andrey Bobyshev; Mark Bowden; Matt Crawford; Phil DeMar; Vyto Grigaliunas; Maxim Grigoriev; Don Petravick

Collaboration


Dive into Matt Crawford's collaboration.

Top Co-Authors

Alan Blatecky

Renaissance Computing Institute


Artur Barczyk

California Institute of Technology
