Greg M. Link | Researchain

Publication

Featured research published by Greg M. Link.

IEEE Computer Society Annual Symposium on VLSI | 2004

Fault tolerant algorithms for network-on-chip interconnect

Matthew Pirretti; Greg M. Link; Richard R. Brooks; Narayanan Vijaykrishnan; Mahmut T. Kandemir; Mary Jane Irwin

As technology scales, fault tolerance is becoming a key concern in on-chip communication. Consequently, this work examines fault tolerant communication algorithms for use in the NoC domain. Two different flooding algorithms and a random walk algorithm are investigated. We show that the flood-based fault tolerant algorithms have an exceedingly high communication overhead. We find that the redundant random walk algorithm offers significantly reduced overhead while maintaining useful levels of fault tolerance. We then compare the implementation costs of these algorithms, both in terms of area as well as in energy consumption, and show that the flooding algorithms consume an order of magnitude more energy per message transmitted.
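To make the overhead comparison concrete, the following Python sketch counts link transmissions on a fault-free k×k mesh for a naive flood and for a small number of redundant random walks. It is an illustration only: the mesh model, the function names (`flood_transmissions`, `random_walk_transmissions`), and the parameters `walkers` and `ttl` are assumptions made here, not details taken from the paper, whose algorithms and fault model are more elaborate.

```python
import random
from collections import deque


def neighbors(node, k):
    """Return the mesh neighbours of (x, y) on a k x k grid."""
    x, y = node
    steps = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(a, b) for a, b in steps if 0 <= a < k and 0 <= b < k]


def flood_transmissions(k, src=(0, 0)):
    """Naive flood: every node forwards the message once to all neighbours."""
    seen = {src}
    queue = deque([src])
    sent = 0
    while queue:
        node = queue.popleft()
        for nxt in neighbors(node, k):
            sent += 1  # one link transmission per neighbour
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sent


def random_walk_transmissions(k, src, dst, walkers, ttl, rng):
    """Send `walkers` copies on independent random walks of at most `ttl` hops.

    Assumes src != dst. Returns (transmissions, delivered).
    """
    sent = 0
    delivered = False
    for _ in range(walkers):
        node = src
        for _ in range(ttl):
            node = rng.choice(neighbors(node, k))
            sent += 1
            if node == dst:
                delivered = True
                break
    return sent, delivered


if __name__ == "__main__":
    print("flood:", flood_transmissions(4))
    print("random walk:", random_walk_transmissions(
        4, (0, 0), (3, 3), walkers=4, ttl=8, rng=random.Random(0)))
```

On a 4×4 mesh the naive flood sends 48 messages, while four walkers with a TTL of 8 send at most 32, at the price that delivery is no longer guaranteed.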

International Symposium on Quality Electronic Design | 2006

Interconnect and Thermal-aware Floorplanning for 3D Microprocessors

Wei-Lun Hung; Greg M. Link; Yuan Xie; Narayanan Vijaykrishnan; Mary Jane Irwin

Interconnects are becoming an increasing problem from both performance and power consumption perspectives in future technology nodes. The introduction of 3D chip architectures, with their intrinsic capability of reducing wire length, is one of the promising solutions to mitigate the interconnect problem. While interconnect power consumption reduces due to the adoption of 3D designs, the stacking of multiple active layers leads to higher power densities. Thus, high peak temperatures are of major concern in 3D designs. Consequently, we present a thermal-aware floorplanner for 3D architectures. In contrast to most prior work, our floorplanner considers the interconnect power consumption in exploring a thermal-aware floorplan. Our results show that excluding interconnect power can result in peak temperatures being underestimated by as much as 15 °C in 90nm technology. Finally, we demonstrate that our floorplanner is effective in lowering peak temperatures using a microprocessor design and four MCNC designs as benchmarks.

International Conference on VLSI Design | 2004

Embedded hardware face detection

Theo Theocharides; Greg M. Link; Narayanan Vijaykrishnan; Mary Jane Irwin; Wayne H. Wolf

Face detection is the first step towards face recognition and is a vital task in surveillance and security applications. Current software implementations of face detection algorithms lack the computational ability to support detection in real-time video streams. Consequently, this work focuses on the design of special-purpose hardware for performing rotation-invariant face detection. The synthesized design using 160 nm technology is found to operate at 409.5 kHz, providing a throughput of 424 frames per second, and consumes 7 Watts of power. The synthesized design provided 75% accuracy in detecting faces from a set of 55 images, which is competitive with existing software implementations that provide around 80-85% accuracy.

International Symposium on Quality Electronic Design | 2006

Thermal Trends in Emerging Technologies

Greg M. Link; Narayanan Vijaykrishnan

In the future, the peak temperature of a chip will be a primary design constraint. In order to meet this constraint, temperature must be considered in the earliest phases of the design process. Using a newly developed thermal analysis tool, HS3d, this work explores the thermal profile of devices as technology varies. We show that as technology scales, the hotspot locations can shift from the units with the most switching activity to those with the most low-threshold transistors. We further note that process variations in leakage-dominated technologies can result in significant variations in the hotspot locations, indicating that feedback from thermal sensors will be very important. Finally, this work examines the thermal effects of multi-layer device stacking technologies, and finds that the vertical temperature difference between layers is much less significant than the horizontal differences due to power density, and as such, vertical placement optimizations will have much smaller impact on hotspot development than a uniform power distribution.

International Conference on Computer Design | 2005

Temperature-aware voltage islands architecting in system-on-chip design

Wei-Lun Hung; Greg M. Link; Yuan Xie; Narayanan Vijaykrishnan; N. Dhanwada; J. Conner

As technology scales, power consumption and thermal effects have become challenges for system-on-chip designers. The rising on-chip temperatures can have negative impacts on SoC performance, power, and reliability. In view of this, we present a hybrid optimization approach which aims at temperature reduction and hot spot elimination. We demonstrate that considerable improvement in the thermal distribution of a design can be achieved through careful voltage island partitioning, voltage level assignment, and voltage island floorplanning. The experimental results on MCNC benchmarks show significant improvement on the thermal profiles. To the best of our knowledge, this is the first work to explore the thermal impacts of voltage islands.

Design, Automation, and Test in Europe | 2005

Hotspot Prevention Through Runtime Reconfiguration in Network-On-Chip

Greg M. Link; Narayanan Vijaykrishnan

Many existing thermal management techniques focus on reducing the overall power consumption of the chip, and do not address location-specific temperature problems referred to as hotspots. We propose the use of dynamic runtime reconfiguration to shift the hotspot-inducing computation periodically and make the thermal profile more uniform. Our analysis shows that dynamic reconfiguration is an effective technique in reducing hotspots for NoC.

IEEE Transactions on Computers | 2005

A holistic approach to designing energy-efficient cluster interconnects

Eun Jung Kim; Greg M. Link; Ki Hwan Yum; Narayanan Vijaykrishnan; Mahmut T. Kandemir; Mary Jane Irwin; Chita R. Das

Designing energy-efficient clusters has recently become an important concern to make these systems economically attractive for many applications. Since the cluster interconnect is a major part of the system, the focus of this paper is to characterize and optimize the energy consumption in the entire interconnect. Using a cycle-accurate simulator of an InfiniBand Architecture (IBA) compliant interconnect fabric and actual designs of its components, we investigate the energy behavior on regular and irregular interconnects. The energy profile of the three major components (switches, network interface cards (NICs), and links) reveals that the links and switch buffers consume the major portion of the power budget. Hence, we focus on energy optimization of these two components. To minimize power in the links, first we investigate the dynamic voltage scaling (DVS) algorithm and then propose a novel dynamic link shutdown (DLS) technique. The DLS technique makes use of an appropriate adaptive routing algorithm to shut down the links intelligently. We also present an optimized buffer design for reducing leakage energy in 70nm technology. Our analysis on different networks reveals that, while DVS is an effective energy conservation technique, it incurs significant performance penalty at low to medium workload. Moreover, energy saving with DVS reduces as the buffer leakage current becomes significant with 70nm design. On the other hand, the proposed DLS technique can provide optimized performance-energy behavior (up to 40 percent energy savings with less than 5 percent performance degradation in the best case) for the cluster interconnects.

International Conference on VLSI Design | 2005

Implementing LDPC decoding on network-on-chip

Theo Theocharides; Greg M. Link; Narayanan Vijaykrishnan; Mary Jane Irwin

Low-density parity-check (LDPC) codes are a form of error correcting codes used in various wireless communication applications and in disk drives. While LDPC codes are desirable due to their ability to achieve near Shannon-limit communication channel capacity, the computational complexity of the decoder is a major concern. LDPC decoding consists of a series of iterative computations derived from a message-passing bipartite graph. In order to efficiently support the communication-intensive nature of this application, we present an LDPC decoder architecture based on a network-on-chip communication fabric that provides a 1.2 Gbps decoded throughput rate for a 3/4 code rate, 1024-bit block LDPC code. The proposed architecture can be reconfigured to support other LDPC codes of different block sizes and code rates. We also propose two novel power-aware optimizations that reduce the power consumption by up to 30%.
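The iterative exchange between bit nodes and check nodes can be illustrated with a small Python sketch. It uses hard-decision bit flipping, a much simpler scheme than a production LDPC decoder, on a toy 9-bit code whose checks are the row and column parities of a 3×3 grid. The code, the function names, and the decoding rule are assumptions chosen for illustration and do not describe the decoder or the 1024-bit code in the paper.

```python
def build_checks(n=3):
    """Parity checks of a toy n x n grid code: one check per row and column.

    Each check is a list of bit indices; every bit sits in exactly two checks,
    so the parity-check matrix is sparse.
    """
    rows = [[r * n + c for c in range(n)] for r in range(n)]
    cols = [[r * n + c for r in range(n)] for c in range(n)]
    return rows + cols


def unsatisfied(word, checks):
    """Return the checks whose bits do not sum to an even value."""
    return [chk for chk in checks if sum(word[i] for i in chk) % 2]


def bit_flip_decode(word, checks, max_iters=10):
    """Hard-decision bit flipping.

    Each iteration, failing checks "vote" against their bits, and the bits
    with the most votes are flipped. Returns (word, success).
    """
    word = list(word)
    for _ in range(max_iters):
        failed = unsatisfied(word, checks)
        if not failed:
            break
        votes = [0] * len(word)
        for chk in failed:          # check node -> bit node messages
            for i in chk:
                votes[i] += 1
        worst = max(votes)
        word = [b ^ (v == worst) for b, v in zip(word, votes)]
    return word, not unsatisfied(word, checks)


if __name__ == "__main__":
    checks = build_checks()
    codeword = [1, 1, 0, 1, 0, 1, 0, 1, 1]   # all rows and columns have even parity
    received = list(codeword)
    received[4] ^= 1                          # inject a single bit error
    print(bit_flip_decode(received, checks))
```

In the sketch every vote is a message from a check node to a bit node, which is the kind of traffic a network-on-chip fabric would carry when the two node types are mapped onto separate processing elements.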

Asia and South Pacific Design Automation Conference | 2005

FD-HGAC: a hybrid heuristic/genetic algorithm hardware/software co-synthesis framework with fault detection

J. Conner; Yuan Xie; Mahmut T. Kandemir; Greg M. Link; Robert P. Dick

Embedded real-time systems are becoming increasingly complex. To combat the rising design cost of those systems, co-synthesis tools that map tasks to systems containing both software and specialized hardware have been developed. As system transient fault rates increase due to technology scaling, embedded systems must be designed in fault-tolerant ways to maintain system reliability. This paper presents and analyzes FD-HGAC, a tool using a genetic algorithm and heuristics to design real-time systems with partial fault detection. Numerous trials show that the tool produces systems with an average of 22% detection coverage while incurring no cost or performance penalty.

International Parallel and Distributed Processing Symposium | 2007

Load Miss Prediction - Exploiting Power Performance Trade-offs

Konrad Malkowski; Greg M. Link; Padma Raghavan; Mary Jane Irwin

Modern CPUs operate at GHz frequencies, but the latencies of memory accesses are still relatively large, on the order of hundreds of cycles. Deeper cache hierarchies with larger cache sizes can mask these latencies for codes with good data locality and reuse, such as structured dense matrix computations. However, cache hierarchies do not necessarily benefit sparse scientific computing codes, which tend to have limited data locality and reuse. We therefore propose a new memory architecture with a load miss predictor (LMP), which includes a data bypass cache and a predictor table, to reduce access latencies by determining whether a load should bypass the main cache hierarchy and issue an early load to main memory. Our architecture uses the L2 (and lower caches) as a victim cache for data removed from our bypass cache. We use cycle-accurate simulations with SimpleScalar and Wattch to show that our LMP improves the performance of sparse codes, our application domain of interest, on average by 14%, with a 13.6% increase in power. When the LMP is used with dynamic voltage and frequency scaling (DVFS), performance can be improved by 8.7% with system power savings of 7.3% and energy reduction of 17.3% at 1800 MHz relative to the base system at 2000 MHz. Alternatively, our LMP can be used to improve the performance of SPEC benchmarks by an average of 2.9% at the cost of a 7.1% increase in average power.
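One way to picture a predictor table of this kind is a small array of saturating counters indexed by the address of the load instruction. The Python sketch below is hypothetical: the class name `LoadMissPredictor`, the 2-bit counters, the table size, and the indexing function are assumptions made for illustration, since the abstract does not specify how the predictor is organized.

```python
class LoadMissPredictor:
    """Hypothetical PC-indexed table of 2-bit saturating counters."""

    def __init__(self, entries=256):
        self.entries = entries
        self.table = [0] * entries   # 0..1 predict hit, 2..3 predict miss

    def _index(self, pc):
        # Drop the low two bits (assumes 4-byte aligned instructions).
        return (pc >> 2) % self.entries

    def predict_miss(self, pc):
        """True if the load at `pc` is predicted to miss the cache hierarchy,
        in which case it could be issued early to main memory."""
        return self.table[self._index(pc)] >= 2

    def update(self, pc, missed):
        """Train the counter with the actual outcome of the load."""
        i = self._index(pc)
        if missed:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


if __name__ == "__main__":
    lmp = LoadMissPredictor()
    pc = 0x400
    for outcome in (True, True, False):
        print(lmp.predict_miss(pc))
        lmp.update(pc, outcome)
    print(lmp.predict_miss(pc))
```

The two-step hysteresis means a single unexpected hit or miss does not immediately change the prediction for a load that otherwise behaves consistently.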