Publication


Featured research published by Rhonda Kay Gaede.


Cluster Computing | 2001

Hardware-Assisted Characterization of NAS Benchmarks

William E. Cohen; Rhonda Kay Gaede; W. D. Garrett

The UAH Logging, Trace Recording, and Analysis instrumentation (ULTRA) provides highly repeatable (0.0002% variation) application instruction counts for parallel programs. The counts are invariant to the communication network used, the number of processors used, and the MPI communication library used. ULTRA, implemented as an MPI profiling wrapper, avoids the data collection system artifacts of time-based measurements by using instruction counts as the basic measure of work performed, and it records the operation performed and the amount of data sent for each network operation. These measurements can be scaled appropriately for various target architectures. ULTRA's instrumentation overhead is minimized by using the Pentium II processor's performance monitoring hardware, allowing large, production-run applications to be quickly characterized. Traces of the NAS benchmarks representing 6.67×10^12 application instructions were generated by ULTRA. The application instructions executed per byte injected into the network and the instructions executed per message sent were computed from the traces. These values can be scaled by the expected processor performance to estimate the minimum network performance required to support the programs. It is impossible to use time-based measurements for this purpose due to measurement artifacts caused by the background processes and the communication network of the data collection system.
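The scaling step described in this abstract can be illustrated with a small calculation. The following is only a sketch under stated assumptions: the function name, parameter names, and all numeric values are hypothetical and are not taken from the paper. It assumes that a processor sustaining a given instruction rate injects data at that rate divided by the measured instructions per byte.

```python
def min_network_requirements(instr_per_byte, instr_per_message, instr_per_second):
    """Estimate the network load one processor would generate.

    instr_per_byte:    application instructions executed per byte injected
    instr_per_message: application instructions executed per message sent
    instr_per_second:  expected sustained instruction rate of the target CPU

    All three inputs are hypothetical placeholders for values that a trace
    analysis would supply. Returns (bytes per second, messages per second).
    """
    bytes_per_second = instr_per_second / instr_per_byte
    messages_per_second = instr_per_second / instr_per_message
    return bytes_per_second, messages_per_second


if __name__ == "__main__":
    # Made-up example: 1000 instructions per byte, 2 million instructions
    # per message, and a processor sustaining 1e9 instructions per second.
    bandwidth, msg_rate = min_network_requirements(1000.0, 2_000_000.0, 1e9)
    print(f"minimum bandwidth: {bandwidth:.0f} bytes/s")
    print(f"minimum message rate: {msg_rate:.0f} messages/s")
```

Because the inputs are instruction counts rather than times, the same trace could in principle be rescaled for a faster or slower processor by changing only `instr_per_second`.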


IEEE Transactions on Computers | 2000

An optical bus-based distributed dynamic barrier mechanism

William E. Cohen; David W. Hyde; Rhonda Kay Gaede

Barrier synchronization is a useful parallel programming construct for ensuring that all processors are at a particular location in the code before any processor is allowed to continue. Barrier synchronization is integral to programming models such as the Bulk Synchronous Parallel model. Specialized hardware is often used to improve the performance of a barrier synchronization operation. With continued improvement in processor performance, more efficient synchronization mechanisms are required to counter the rising relative cost of synchronization operations. A high-speed, distributed barrier synchronization mechanism has been developed for broadcast-based optical interconnection networks. This mechanism avoids multiple conversions between optical and electrical signals by having each processor locally decide whether the barrier in which it is participating has been satisfied. It also allows arbitrary sized partitions to be built dynamically during the execution of a program. Simulations of the current hardware design estimate that the barrier synchronization requires less than 300 ns for a 128-processor system.
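The idea of each processor deciding locally whether its barrier is satisfied can be sketched in software. The abstract does not describe the hardware design, so everything below is an assumption made for illustration: the class name, the bitmask representation, and the notion that every processor observes arrival announcements from all others over the broadcast medium.

```python
class LocalBarrierState:
    """Toy model of one processor's local view of a barrier (hypothetical)."""

    def __init__(self, partition):
        # Bitmask of the processor IDs taking part in this barrier. Building
        # the mask at run time loosely mirrors dynamically formed partitions.
        self.mask = 0
        for pid in partition:
            self.mask |= 1 << pid
        self.arrived = 0

    def observe_arrival(self, pid):
        # Record an arrival announcement seen on the broadcast medium.
        self.arrived |= 1 << pid

    def satisfied(self):
        # Purely local decision: have all partition members arrived?
        return (self.arrived & self.mask) == self.mask

    def reset(self):
        # Clear arrivals so the same partition can be reused.
        self.arrived = 0


if __name__ == "__main__":
    state = LocalBarrierState([0, 2, 5])
    for pid in (0, 2, 3):          # processor 3 is not in the partition
        state.observe_arrival(pid)
    print(state.satisfied())       # processor 5 has not arrived yet
    state.observe_arrival(5)
    print(state.satisfied())
```

In this sketch no central arbiter is consulted: each participant evaluates the same mask comparison on its own copy of the arrival state.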


Proceedings of Second International Workshop on Massively Parallel Processing Using Optical Interconnections | 1995

The simultaneous optical multiprocessor exchange bus

Jeffrey H. Kulick; W. E. Cohen; Constantine Katsinis; E. Wells; Axel Thomsen; Rhonda Kay Gaede; Robert G. Lindquist; Gregory P. Nordin; M. Abushagur; D. Shen

Low latency, high bandwidth interconnection networks that directly link arbitrary pairs of processing elements without contention are very desirable for parallel computers. Most communication networks in parallel machines have made compromises due to the limitations of electronics. Many of the optical interconnection schemes proposed have simply replaced the point-to-point copper wiring with fiber optics and have not made use of the unique properties of optics. This paper proposes an optical interconnect architecture for over a hundred processors, which contains a dedicated channel for each processor to eliminate global arbitration and to provide bandwidth that scales with the number of processors in the machine. Unlike electrical buses, this architecture is not limited by the medium (fiber optics) used to connect the transmitters and receivers. Each processor has an array of receivers, one receiver for each processor channel. The architecture of the receiver array permits a variety of different parallel programming models to be efficiently supported.
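A toy software model can make the channel-per-processor arrangement concrete. This is a sketch only: the class and method names are hypothetical, message addressing is simplified to an explicit destination list, and nothing here reflects the actual optical or electronic design. It assumes that each sender owns one channel and that each receiver keeps one queue per sender channel.

```python
from collections import deque


class SomeBusModel:
    """Toy model: one output channel per processor, N receivers per processor."""

    def __init__(self, n):
        self.n = n
        # inboxes[r][s] is the queue at processor r fed by sender s's channel.
        self.inboxes = [[deque() for _ in range(n)] for _ in range(n)]

    def send(self, sender, payload, destinations=None):
        # A sender writes only to its own channel, so no arbitration among
        # senders is modeled. destinations=None stands for a broadcast.
        targets = range(self.n) if destinations is None else destinations
        for receiver in targets:
            if receiver != sender:
                self.inboxes[receiver][sender].append(payload)

    def receive_all(self, receiver):
        # Take at most one pending message from every sender channel at once.
        received = {}
        for sender, queue in enumerate(self.inboxes[receiver]):
            if queue:
                received[sender] = queue.popleft()
        return received


if __name__ == "__main__":
    bus = SomeBusModel(4)
    bus.send(0, "a")                      # broadcast from processor 0
    bus.send(1, "b")                      # broadcast from processor 1
    bus.send(2, "c", destinations=[3])    # message addressed to processor 3
    print(bus.receive_all(3))
```

Since every sender has its own queue at every receiver, two simultaneous broadcasts never compete for a shared resource in this model.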


IEEE Symposium on Security and Privacy | 2007

MEMS-Assisted Cryptography for CPI Protection

Jennifer M. English; David Coe; Rhonda Kay Gaede; David W. Hyde; Jeffrey H. Kulick

The authors present a concept for an anti-tamper system that dynamically generates a cryptographic key derived from microelectromechanical systems (MEMS) arrays encapsulated within a protected system in a single package. The system provides protection in active and passive states with no battery backup.


Software - Practice and Experience | 1999

IN-Tune: an in-situ non-invasive performance tuning tool for multi-threaded Linux on symmetric multiprocessing Pentium workstations

Jeremy B. Rodgers; Rhonda Kay Gaede; Jeffrey H. Kulick

This paper documents the design and implementation of the IN‐Tune software tool suite, which enables a user to collect real‐time code and hardware profiling information on Intel‐based symmetric multiprocessors running the Linux operating system. IN‐Tune provides a virtually non‐invasive tool for performance analysis and tuning of programs. Unlike other analysis tools, IN‐Tune isolates data with respect to individual threads. It also uses performance monitoring hardware registers to permit instrumentation of individual threads as they run in‐situ, thus collecting data with appropriate considerations for a multiprocessor environment. Data can be sampled using two different mechanisms. First, the user can collect data by making calls to the system upon the occurrence of specific software events. Second, data can be collected at a fixed, fine‐grained (e.g. 1–10 microseconds) interval using either software or hardware interrupts. To allow observation of codes for which source code modification is impractical or impossible, a ‘shell’ task is created which permits monitoring without code modification. Although this work deals with Intel processors and Linux, the widespread availability of performance monitoring registers in modern processors makes this work widely applicable.


Optoelectronic interconnects and packaging. Conference | 1997

Optoelectronic design of the simultaneous optical multiprocessor exchange bus (SOME-Bus)

Robert G. Lindquist; Jeffrey H. Kulick; Will E. Cohen; Rhonda Kay Gaede; B. Earl Wells; Mustafa A. G. Abushagur; Dashen Shen; Constantine Katsinis; Stephen T. Kowel

Low latency, high bandwidth interconnecting networks that directly link arbitrary pairs of processing elements without contention are very desirable for parallel computers. The simultaneous optical multiprocessor exchange bus (SOME-Bus) based on a fiber optic interconnect is such a network. The SOME-Bus provides a dedicated channel for each processor for data output and thus eliminates global arbitration. Each processor can receive data simultaneously from all other processors in the system using an array of receivers. The architecture allows for simultaneous multicast and broadcast messages using several processors with zero setup time and no global scheduling. In this paper, we discuss the design of a possible opto-electronic implementation of the SOME-Bus along with an optical power budget analysis. Slant Bragg fiber gratings arranged to couple light out of a fiber ribbon cable into an array of amorphous silicon detectors vertically integrated on silicon are presented as a novel, low-cost means of interconnecting 10 to 120 processors.


Journal of Non-crystalline Solids | 2000

Amorphous silicon photodetector for optical interconnections

Rhonda Kay Gaede; Fenglei Li; David W. Hyde; Dashen Shen

This paper discusses the feasibility of using hydrogenated amorphous silicon photodetectors in optical interconnections. Our analysis of the transient carrier transport in amorphous silicon indicated that individual detectors could operate in the 100 MHz–1 GHz range. The parallelism afforded by fabricating large arrays of receivers can provide communication bandwidth in excess of electronic interconnects.


International Test Conference | 1986

Calculation of Greatest Lower Bounds Obtainable by the Cutting Algorithm.

Rhonda Kay Gaede; M. Ray Mercer; Bill Underwood


International Symposium on Parallel Architectures, Algorithms and Networks | 1999

Fault-tolerance using cache-coherent distributed shared memory systems

Diana Lynn Hecht; Krishna M. Kavi; Rhonda Kay Gaede; Constantine Katsinis


Archive | 1999

Interconnection Network Independent Characterization of Communication Traffic in the NAS Benchmarks via Processor Performance Monitoring Hardware

W. E. Cohen; Rhonda Kay Gaede; W. D. Garrett

Collaboration


Rhonda Kay Gaede's most frequent collaborators.

Top Co-Authors

Jeffrey H. Kulick

University of Alabama in Huntsville

William E. Cohen

University of Alabama in Huntsville

David W. Hyde

University of Alabama in Huntsville

Dashen Shen

University of Alabama in Huntsville

David Coe

University of Alabama in Huntsville

Jeremy B. Rodgers

University of Alabama in Huntsville

Robert G. Lindquist

University of Alabama in Huntsville
