Gregory T. Byrd
North Carolina State University
Publication
Featured research published by Gregory T. Byrd.
IEEE Spectrum | 1995
Gregory T. Byrd; Mark A. Holliday
The authors describe how independent streams of instructions, interwoven on a single processor, fill its otherwise idle cycles and so boost its performance. They detail how such multithreaded architectures take the tack of hiding latency by supporting multiple concurrent streams of execution. When a long-latency operation occurs in one of the threads, another begins execution. In this way, useful work is performed while the time-consuming operation is completed.
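As a rough illustration of this latency-hiding argument, the following back-of-the-envelope model (not taken from the article; the parameters are hypothetical) shows how processor utilization grows with the number of interleaved threads until the stall window is fully covered:

```python
def utilization(n_threads: int, run_cycles: int, latency_cycles: int) -> float:
    """Fraction of cycles doing useful work, assuming each thread computes
    for run_cycles, then stalls for latency_cycles, with free thread
    switches (an idealized model)."""
    # One thread covers run_cycles of every (run_cycles + latency_cycles)
    # window; extra threads fill the stall window until it saturates.
    return min(1.0, n_threads * run_cycles / (run_cycles + latency_cycles))

# Example: 20 cycles of work between 100-cycle memory stalls.
for n in (1, 2, 4, 6):
    print(n, round(utilization(n, 20, 100), 2))  # 0.17, 0.33, 0.67, 1.0
```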
Communications of The ACM | 1995
Daniel S. Stevenson; Nathan Hillery; Gregory T. Byrd
High-speed networking technology and standards have progressed dramatically in the past few years, and much attention is now focused on deployment efforts, such as the North Carolina Information Highway (NCIH) [7], and on applications. With this shift in emphasis, concerns have been raised about information security. Examples of abuse of the Internet abound, and unfortunately ATM networks are subject to many of these same abuses. This is of substantial concern when considering the extension of public data networking to broad segments of society.
Proceedings of the IEEE | 1999
Gregory T. Byrd; Michael J. Flynn
The shared memory abstraction supported by hardware-based distributed shared memory (DSM) multiprocessors is an inherently consumer-driven means of communication. When a process requires data, it retrieves them from the global shared memory. In distributed cache-coherent systems, the data may reside in a remote memory module or in the producer's cache. Producer-initiated mechanisms reduce communication latency by sending data to the consumer as soon as they are produced. We classify producer-initiated mechanisms as implicit or explicit, according to whether the producer must know the identity of the consumer when data are transmitted. Explicit schemes include data forwarding and message passing. Implicit schemes include update-based coherence, selective updates, and cache-based locks. Several of these mechanisms are evaluated for performance and sensitivity to network parameters, using a common simulated architecture and a set of application kernel benchmarks. StreamLine, a cache-based message-passing mechanism, provides the best performance on the benchmarks with regular communication patterns. Forwarding-write and cache-based locks are also among the best-performing producer-initiated mechanisms. Consumer-initiated prefetch, however, has good average performance and is the least expensive to implement.
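To make the consumer-initiated versus producer-initiated distinction concrete, here is a toy Python sketch (illustrative only; the names and structure are not from the paper):

```python
from queue import Queue

class SharedCell:
    """Consumer-initiated: the consumer asks and waits (pull)."""
    def __init__(self):
        self.value = None
    def producer_write(self, v):
        self.value = v              # data stays at the producer's node
    def consumer_read(self):
        return self.value           # consumer pays the fetch round trip

class ForwardingCell:
    """Producer-initiated (explicit): data is pushed to a known consumer."""
    def __init__(self):
        self.mailbox = Queue()      # stands in for the consumer's cache
    def producer_write(self, v):
        self.mailbox.put(v)         # forwarded as soon as it is produced
    def consumer_read(self):
        return self.mailbox.get()   # already local (or in flight)
```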
high-performance computer architecture | 2003
Khaled Z. Ibrahim; Gregory T. Byrd; Eric Rotenberg
Scalability of applications on distributed shared-memory (DSM) multiprocessors is limited by communication overheads. At some point, using more processors to increase parallelism yields diminishing returns or even degrades performance. When increasing concurrency is futile, we propose an additional mode of execution, called slipstream mode, that instead enlists extra processors to assist parallel tasks by reducing perceived overheads. We consider DSM multiprocessors built from dual-processor chip multiprocessor (CMP) nodes with shared L2 cache. A task is allocated on one processor of each CMP node. The other processor of each node executes a reduced version of the same task. The reduced version skips shared-memory stores and synchronization, running ahead of the true task. Even with the skipped operations, the reduced task makes accurate forward progress and generates an accurate reference stream, because branches and addresses depend primarily on private data. Slipstream execution mode yields two benefits. First, the reduced task prefetches data on behalf of the true task. Second, reduced tasks provide a detailed picture of future reference behavior, enabling a number of optimizations aimed at accelerating coherence events, e.g., self-invalidation. For multiprocessor systems with up to 16 CMP nodes, slipstream mode outperforms running one or two conventional tasks per CMP in 7 out of 9 parallel scientific benchmarks. Slipstream mode is 12-19% faster with prefetching only and up to 29% faster with self-invalidation enabled.
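The following schematic sketch (hypothetical names and structure; not the paper's code) shows how a slipstream-reduced task relates to the true task it assists:

```python
def f(x):                      # stand-in for per-element computation
    return x * 2 + 1

def true_task(a, b, indices, barrier_wait):
    for i in indices:
        b[i] = f(a[i])         # load a[i] (may miss), then shared store
    barrier_wait()             # synchronize with peer tasks

def reduced_task(a, indices):
    for i in indices:
        _ = f(a[i])            # same load stream: warms the shared L2
        # shared store to b[i] skipped; barrier skipped -> runs ahead
```

Because branches and addresses depend primarily on private data, the reduced copy still generates an accurate reference stream even though its stores never become visible.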
international conference on service oriented computing | 2005
Mine Altunay; Douglas E. Brown; Gregory T. Byrd; Ralph A. Dean
Security and trust relationships between services significantly govern their willingness to collaborate and participate in a workflow. Existing workflow tools do not consider such relationships as an integral part of their planning logic; rather, they approach security as a run-time issue. We present a workflow management framework that fully integrates trust and security into the workflow planning logic. It considers not only trust relationships between the workflow requestor and individual services, but also trust relationships among the services themselves. It allows each service owner to define an upper layer of collaboration policies (rules that specify the terms under which participation in a workflow is allowed) and integrates them into the planning logic. Services that are unfit for collaboration due to security violations are replaced at the planning stage. This approach increases service owners' control over the workflow path, increases their willingness to collaborate, and avoids run-time security failures.
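A minimal sketch of the planning-time idea, with hypothetical types and policy shape (the paper's policy language is richer): candidate paths whose adjacent services reject each other are pruned before the workflow path is fixed, rather than failing at run time.

```python
from dataclasses import dataclass, field
from itertools import product

@dataclass
class Service:
    name: str
    allowed_peers: set = field(default_factory=set)  # owner-defined policy

def admissible(chain):
    """True if every adjacent pair satisfies both services' policies."""
    return all(b.name in a.allowed_peers and a.name in b.allowed_peers
               for a, b in zip(chain, chain[1:]))

def plan(candidates_per_step):
    """Brute-force planner: return the first policy-admissible path."""
    for chain in product(*candidates_per_step):
        if admissible(chain):
            return chain
    return None
```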
sensor networks ubiquitous and trustworthy computing | 2008
Mu-Huan Chiang; Gregory T. Byrd
In dense wireless sensor networks, density control is an important technique for prolonging network lifetime. However, due to the intrinsic many-to-one communication pattern of sensor networks, nodes close to the sink tend to deplete their energy faster than other nodes. This unbalanced energy usage among nodes significantly reduces the network lifetime. In this paper, we propose neighborhood-aware density control (NADC) to alleviate this undesired effect by reducing unnecessary overhearing along routing paths. In NADC, nodes observe their neighborhoods and dynamically adapt their participation in the multihop network topology. Since neighborhood information can easily be gleaned from overheard traffic, the density in different regions can be adjusted adaptively in a fully distributed manner. Simulation experiments demonstrate that NADC alleviates the extremely unbalanced workload and extends the effective network lifetime without a significant increase in data delivery latency.
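A toy rendering of the distributed decision (the threshold and names are hypothetical, not NADC's actual rule): each node counts the active forwarders it overhears and withdraws when its region is over-covered.

```python
def decide_state(active_neighbors_heard: int, target_density: int = 3) -> str:
    """Return 'active' or 'sleep' from locally overheard information.
    No coordinator is needed: each node decides independently."""
    if active_neighbors_heard >= target_density:
        return "sleep"   # enough forwarders nearby; avoid overhearing cost
    return "active"      # region under-covered; keep forwarding

print(decide_state(5))   # -> 'sleep'
print(decide_state(1))   # -> 'active'
```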
computing frontiers | 2009
Salil Mohan Pant; Gregory T. Byrd
Transactional Memory (TM) is an optimistic speculative synchronization scheme that provides atomic execution for a region of code marked as a transaction by the programmer. TM avoids many of the problems associated with lock-based synchronization and can make writing parallel programs easier. Programs with critical sections that are not heavily contended benefit from the optimistic nature of TM systems. For heavily contended critical sections, however, performance can degrade due to conflicts, leading to stalls and expensive rollbacks. In this paper, we look into the nature of the shared data involved in conflicts in TM systems. We find that most transactions have conflicts around a few shared addresses, and that shared, conflicting data is often updated in a predictable manner by different transactions. We propose using a memory-level value predictor to capture this predictability and to increase overall concurrency by satisfying loads from conflicting transactions with predicted values instead of stalling. We present one possible design and implementation of a TM system with a value predictor. Our benchmark results show that the value predictor can capture this predictable behavior for most benchmarks and can improve the performance of TM programs by increasing concurrency and minimizing stalls and rollbacks.
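In the spirit of the mechanism described above, here is a minimal stride value predictor keyed by memory address (the details are illustrative, not the paper's design): a conflicting transactional load could be satisfied with predict() instead of stalling, with the speculation verified before commit.

```python
class StridePredictor:
    def __init__(self):
        self.last = {}     # addr -> last committed value
        self.stride = {}   # addr -> last observed delta

    def train(self, addr, value):
        """Record a committed value and update the observed stride."""
        if addr in self.last:
            self.stride[addr] = value - self.last[addr]
        self.last[addr] = value

    def predict(self, addr):
        """Predicted next value, or None if there is no history (stall)."""
        if addr not in self.last:
            return None
        return self.last[addr] + self.stride.get(addr, 0)

p = StridePredictor()
for v in (10, 20, 30):       # e.g., a shared counter bumped per transaction
    p.train(0x1000, v)
print(p.predict(0x1000))     # -> 40
```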
international conference on computer communications and networks | 2001
Rong Wang; Feiyi Wang; Gregory T. Byrd
Intrusion detection research has so far mostly concentrated on techniques that effectively identify malicious behavior; once the system is compromised, no assurance can be assumed. Intrusion tolerance, on the other hand, focuses on providing a minimal level of service even when some components have been partially compromised. The challenges here are how to take advantage of fault tolerance techniques in the intrusion-tolerant system context and how to deal with possible unknown attacks and compromised components so as to continue providing the service. This paper presents our work on applying one important fault tolerance technique, acceptance testing, to building scalable intrusion-tolerant systems. First, we propose a general methodology for designing acceptance tests, along with an acceptance monitor architecture that applies various tests to detect compromises based on the impact of the attacks. Second, we perform a comprehensive vulnerability analysis of typical commercial off-the-shelf (COTS) Web servers. Various acceptance testing modules are implemented to show the effectiveness of the proposed approach. By applying fault tolerance techniques to intrusion-tolerant systems, we provide a mechanism for building reliable distributed services that are more resistant to both known and unknown attacks.
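An illustrative acceptance test in this style (the invariants and names below are hypothetical examples, not the paper's modules): the reply from a possibly compromised COTS web server is checked against cheap invariants before being released.

```python
import re

def accept_http_reply(status: int, headers: dict, body: bytes) -> bool:
    """Reject replies whose observable behavior deviates from what a
    healthy server should produce for this service."""
    if status not in (200, 301, 302, 304, 404):
        return False                                  # unexpected status
    if int(headers.get("Content-Length", len(body))) != len(body):
        return False                                  # length mismatch
    if re.search(rb"/etc/passwd|<script>alert", body):
        return False                                  # defacement/leak marker
    return True
```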
International Journal of Sensor Networks | 2009
Mu-Huan Chiang; Gregory T. Byrd
Data aggregation reduces energy consumption in sensor networks by reducing the number of message transmissions. Effective aggregation requires that event messages be routed along common paths. While existing routing protocols provide many ways to construct the aggregation tree, this opportunistic style of aggregation is usually not optimal. The Minimal Steiner Tree (MST) maximizes the possible degree of aggregation, but finding such a tree requires global knowledge of the network, which is not practical in sensor networks. In this paper, we propose the Adaptive Aggregation Tree (AAT), which dynamically transforms the structure of the routing tree to improve the efficiency of data aggregation. It adapts automatically to changes in the set of source nodes and approaches the cost savings of MST without explicit maintenance of an infrastructure. The evaluation results show that AAT reduces communication energy consumption by 23% compared to a shortest-path tree and by 31% compared to GPSR.
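A small sketch of why tree shape matters for aggregation (this simplification is not the AAT algorithm itself; the topology is made up): with perfect aggregation, each tree edge is traversed once, so total transmissions track the edge count of the union of source-to-sink paths.

```python
def tree_cost(parent: dict, sources: set) -> int:
    """Transmissions when every node forwards one aggregated message."""
    used_edges = set()
    for s in sources:
        node = s
        while node in parent:                 # walk up toward the sink
            used_edges.add((node, parent[node]))
            node = parent[node]
    return len(used_edges)

# Shortest-path tree: two sources take disjoint 3-hop paths (6 sends).
spt = {"a": "x1", "x1": "x2", "x2": "sink",
       "b": "y1", "y1": "y2", "y2": "sink"}
# Steiner-like tree: sources merge early and share a path (4 sends).
merged = {"a": "b", "b": "x1", "x1": "x2", "x2": "sink"}
print(tree_cost(spt, {"a", "b"}), tree_cost(merged, {"a", "b"}))  # 6 4
```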
Network Processor Design | 2003
Deepak Suryanarayanan; John William Marshall; Gregory T. Byrd
Network processors (NPs) are an emerging class of processors that combine programmable ASICs and microprocessors to implement adaptive network services. NPs couple the flexibility of software solutions with the high performance of custom hardware. The development of such sophisticated hardware requires a holistic methodology that facilitates the study of network processors and their performance under different networking applications and traffic conditions. This combination of study techniques is accomplished in the Component Network Simulator (ComNetSim). The simulator includes both a traffic-modeling component and a detailed architectural framework that allows the study of complete networking applications under varying network traffic conditions. The chapter illustrates a weighted round-robin scheduling algorithm adapted to the Toaster architecture. It describes the high-level simulator design and details the Toaster network processor and the implementation of the simulator, including a cycle-accurate model of the Toaster architecture. The chapter also briefly presents the simulator organization, along with performance results and analysis.
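For reference, here is a straightforward weighted round-robin scheduler of the general kind the chapter adapts to the Toaster architecture (this rendering is generic, not the chapter's implementation): each queue is served in proportion to its weight within one scheduling round.

```python
from collections import deque

def wrr_schedule(queues, weights, rounds=1):
    """queues: list of deques of packets; weights: packets served per round."""
    out = []
    for _ in range(rounds):
        for q, w in zip(queues, weights):
            for _ in range(w):           # serve up to w packets this round
                if q:
                    out.append(q.popleft())
    return out

q0 = deque(["a1", "a2", "a3", "a4"])
q1 = deque(["b1", "b2"])
print(wrr_schedule([q0, q1], weights=[2, 1], rounds=2))
# -> ['a1', 'a2', 'b1', 'a3', 'a4', 'b2']
```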