
Alex Nicolau

University of California, Irvine

Publication

Featured research published by Alex Nicolau.

design, automation, and test in europe | 2002

Automatic Verification of In-Order Execution In Microprocessors with Fragmented Pipelines and Multicycle Functional Units

Prabhat Mishra; Hiroyuki Tomiyama; Nikil D. Dutt; Alex Nicolau

As embedded systems continue to face increasingly higher performance requirements, deeply pipelined processor architectures are being employed to meet desired system performance. System architects critically need modeling techniques that allow exploration, evaluation, customization and validation of different processor pipeline configurations, tuned for a specific application domain. We propose a novel finite state machine (FSM) based modeling of pipelined processors and define a set of properties that can be used to verify the correctness of in-order execution in the presence of fragmented pipelines and multicycle functional units. Our approach leverages the system architect's knowledge about the behavior of the pipelined processor through architecture description language (ADL) constructs, and thus allows a powerful top-down approach to pipeline verification. We applied this methodology to the DLX processor to demonstrate the usefulness of our approach.

design, automation, and test in europe | 2012

VaMV: variability-aware memory virtualization

Luis Angel D. Bathen; Nikil D. Dutt; Alex Nicolau; Puneet Gupta

Power consumption variability of both on-chip SRAMs and off-chip DRAMs is expected to continue to increase over the next decades. We opportunistically exploit this variability through a novel Variability-aware Memory Virtualization (VaMV) layer that allows programmers to partition their application's address space (through annotations) into virtual address regions and create mapping policies for each region. Each policy has different requirements (e.g., power, fault-tolerance) and is exploited by our dynamic memory management module (VaMVisor), which adapts to the underlying hardware, prioritizes the memory resources according to their characteristics (e.g., power consumption), and selectively maps data to the best-fitting memory resource (e.g., high-utilization data to low-power memory space). Our experimental results on embedded benchmarks show that VaMV is capable of reducing dynamic power consumption by 63% on average while reducing total execution time by an average of 34% by exploiting: 1) SRAM voltage scaling, 2) DRAM power variability, and 3) efficient dynamic policy-driven variability-aware memory allocation.

design automation conference | 2014

Power / Capacity Scaling: Energy Savings With Simple Fault-Tolerant Caches

Mark Gottscho; Abbas BanaiyanMofrad; Nikil D. Dutt; Alex Nicolau; Puneet Gupta

Complicated approaches to fault-tolerant voltage-scalable (FTVS) SRAM cache architectures can suffer from high overheads. We propose static (SPCS) and dynamic (DPCS) variants of power/capacity scaling, a simple and low-overhead fault-tolerant cache architecture that utilizes insights gained from our 45nm SOI test chip. Our mechanism combines multi-level voltage scaling with power gating of blocks that become faulty at each voltage level. The SPCS policy sets the runtime cache VDD statically such that almost all of the cache blocks are not faulty. The DPCS policy opportunistically reduces the voltage further to save more power than SPCS while limiting the impact on performance caused by additional faulty blocks. Through an analytical evaluation, we show that our approach can achieve lower static power for all effective cache capacities than a recent complex FTVS work. This is due to significantly lower overheads, despite the failure of our approach to match the min-VDD of the competing work at fixed yield. Through architectural simulations, we find that the average energy saved by SPCS is 55%, while DPCS saves an average of 69% of energy with respect to baseline caches at 1 V. Our approach incurs no more than 4% performance and 5% area penalties in the worst case cache configuration.
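The static policy described above can be illustrated with a minimal sketch. It assumes a hypothetical per-block fault map (`block_min_vdd`, the lowest voltage at which each block still works) and a hypothetical capacity threshold `min_capacity`; none of these names or values come from the paper, and the actual hardware mechanism is not reproduced here.

```python
from typing import List


def faulty_blocks(block_min_vdd: List[float], vdd: float) -> List[int]:
    """Return indices of blocks that would be faulty (and power gated) at vdd.

    block_min_vdd is a hypothetical fault map: entry i is the lowest supply
    voltage at which block i still operates correctly.
    """
    return [i for i, vmin in enumerate(block_min_vdd) if vmin > vdd]


def select_static_vdd(block_min_vdd: List[float],
                      levels: List[float],
                      min_capacity: float = 0.99) -> float:
    """Pick the lowest allowed voltage level that keeps enough blocks usable.

    min_capacity is an assumed threshold standing in for "almost all of the
    cache blocks are not faulty"; the paper's actual criterion may differ.
    """
    total = len(block_min_vdd)
    for vdd in sorted(levels):
        usable = total - len(faulty_blocks(block_min_vdd, vdd))
        if usable / total >= min_capacity:
            return vdd
    # No level satisfies the threshold: fall back to the highest voltage.
    return max(levels)
```

With the made-up fault map `[0.6, 0.65, 0.7, 0.9]`, levels `[0.6, 0.7, 0.8, 1.0]`, and a threshold of 0.75, the function returns 0.7 and block 3 is the one that would be power gated. A dynamic policy in the spirit of DPCS might call the same routine with a lower threshold at run time, though that is only an assumption about how the idea could be modeled.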

international conference on hardware/software codesign and system synthesis | 2012

ViPZonE: OS-level memory variability-driven physical address zoning for energy savings

Luis Angel D. Bathen; Mark Gottscho; Nikil D. Dutt; Alex Nicolau; Puneet Gupta

ITRS predicts that over the next decade, hardware power variation will increase at alarming rates. As a result, designers must build software that can adapt to and exploit these variations to reduce power consumption and improve system performance. This paper presents ViPZonE, a system-level solution that opportunistically exploits DRAM power variation through physical address zoning. ViPZonE is composed of a variability-aware software stack that allows developers to indicate to the OS the expected dominant usage patterns (write or read) as well as level of utilization (high, medium, or low) through high-level APIs. ViPZonE's variability-aware page allocator, implemented in the Linux kernel, is responsible for interpreting these high-level requests for memory and transparently mapping them to physical address zones with different power consumption. Our experimental results across various configurations running PARSEC workloads show an average of 13.1% memory power consumption savings at the cost of a modest 1.03% increase in execution time over a typical Linux virtual memory allocator.
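One way to picture the mapping from usage hints to zones is the sketch below. The `Zone` record, the `pick_zone` function, and the ranking rule (high-utilization requests get the zone that is cheapest for the dominant access type) are all illustrative assumptions; they are not the ViPZonE API or the kernel allocator's actual policy.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Zone:
    """Hypothetical physical address zone with measured access power."""
    name: str
    read_power_mw: float
    write_power_mw: float


def pick_zone(zones: List[Zone], dominant: str, utilization: str) -> Zone:
    """Choose a zone from a (dominant access, utilization) hint.

    Assumed policy: rank zones by power for the dominant access type, then
    give high-utilization requests the cheapest zone, medium the middle one,
    and low the most expensive one.
    """
    if dominant not in ("read", "write"):
        raise ValueError("dominant must be 'read' or 'write'")
    if utilization not in ("high", "medium", "low"):
        raise ValueError("utilization must be 'high', 'medium', or 'low'")
    if not zones:
        raise ValueError("at least one zone is required")

    if dominant == "read":
        ranked = sorted(zones, key=lambda z: z.read_power_mw)
    else:
        ranked = sorted(zones, key=lambda z: z.write_power_mw)

    index = {"high": 0, "medium": len(ranked) // 2, "low": len(ranked) - 1}
    return ranked[index[utilization]]
```

For example, with three made-up zones `A` (read 10 mW, write 30 mW), `B` (20, 20), and `C` (30, 10), a write-dominant, high-utilization request lands in `C`, while a read-dominant, high-utilization request lands in `A`.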

international parallel and distributed processing symposium | 2003

FORGE: a framework for optimization of distributed embedded systems software

Radu Cornea; Nikil D. Dutt; Rajesh K. Gupta; Ingolf Krueger; Alex Nicolau; Doug Schmidt; Sandeep K. Shukla

FORGE brings together a number of advances in architectural modeling, software architecture and distributed/real-time systems to build a platform that provides two fundamental capabilities for distributed, real-time, and embedded (DRE) system development: (a) conceptualization and coding of the design knowledge through collaborative specifications that are inherently matched to distributed solutions; and (b) exploitation of the design knowledge across all development phases for the DRE systems. Our proof-of-concept FORGE prototype is built upon collaborative specifications captured by extensions to the message sequence charts (MSCs) that drive the customization of CompOSE|Q middleware services and generate node-architecture specific code through descriptions of the architecture and resources captured using ADL and RDL respectively.

asia and south pacific design automation conference | 2002

Automatic modeling and validation of pipeline specifications driven by an architecture description language [SoC]

Prabhat Mishra; Ashok Halambi; Peter Grun; Nikil D. Dutt; Alex Nicolau; Hiroyuki Tomiyama

Verification is one of the most complex and expensive tasks in the current systems-on-chip (SOC) design process. Many existing approaches employ a bottom-up approach to pipeline validation, where the functionality of an existing pipelined processor is, in essence, reverse-engineered from its RT-level implementation. Our approach leverages the system architect's knowledge about the behavior of the pipelined architecture, through architecture description language (ADL) constructs, and thus allows a powerful top-down approach to pipeline validation. This paper addresses automatic validation of processor, memory, and co-processor pipelines described in an ADL. We present a graph-based modeling of architectures which captures both structure and behavior of the architecture. Based on this model, we present formal approaches for automatic validation of the architecture described in the ADL. We applied our methodology to verify several realistic architectures from different architectural domains to demonstrate the usefulness of our approach.

international conference on hardware/software codesign and system synthesis | 2005

Aggregating processor free time for energy reduction

Alex Nicolau; Nikil D. Dutt; Eugene Earlie; Aviral Shrivastava

Even after carefully tuning the memory characteristics to the application properties and the processor speed, during the execution of real applications there are times when the processor stalls, waiting for data from the memory. Processor stall can be used to increase the throughput by temporarily switching to a different thread of execution, or reduce the power and energy consumption by temporarily switching the processor to low-power mode. However, any such technique has a performance overhead in terms of switching time. Even though over the execution of an application the processor is stalled for a considerable amount of time, each stall duration is too small to profitably perform any state switch. In this paper, we present code transformations to aggregate processor free time. Our experiments on the Intel XScale and Stream kernels show that up to 50,000 processor cycles can be aggregated, and used to profitably switch the processor to low-power mode. We further show that our code transformations can switch the processor to low-power mode for up to 75% of kernel runtime, achieving up to 18% of processor energy savings on multimedia applications. Our technique requires minimal architectural modifications and incurs negligible (< 1%) performance loss.
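The break-even argument in this abstract (a stall is only worth a mode switch if it outlasts the switching overhead) can be made concrete with a small calculation. The function name and all cycle counts below are hypothetical and chosen only for illustration; they are not measurements from the paper.

```python
from typing import List


def low_power_cycles(stall_cycles: List[int], switch_overhead: int) -> int:
    """Cycles that could be spent in low-power mode for a list of stalls.

    A stall is used only if it is longer than the (assumed) round-trip
    switching overhead; the overhead itself is not spent in low-power mode.
    """
    return sum(stall - switch_overhead
               for stall in stall_cycles
               if stall > switch_overhead)


# Hypothetical numbers: 100 scattered stalls of 50 cycles each versus one
# aggregated stall of the same total length, with a 180-cycle overhead.
scattered = [50] * 100
aggregated = [sum(scattered)]
```

Under these assumed numbers, `low_power_cycles(scattered, 180)` is 0 because no single stall covers the overhead, while `low_power_cycles(aggregated, 180)` is 4820: the same total free time becomes usable once it is contiguous.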

Mobile Computing and Communications Review | 2006

PBPAIR: an energy-efficient error-resilient encoding using probability based power aware intra refresh

Minyoung Kim; Hyunok Oh; Nikil D. Dutt; Alex Nicolau; Nalini Venkatasubramanian

Error resilient encoding in video communication is becoming increasingly important due to data transmission over unreliable channels. In this paper, we propose a new power-aware error resilient coding scheme based on network error probability and user expectation in video communication using mobile handheld devices. By considering both image content and network conditions, we can achieve a fast recoverable and energy-efficient error resilient coding scheme. More importantly, our approach allows system designers to evaluate various operating points in terms of error resilient level and energy consumption over a wide range of system operating conditions. We have implemented our scheme on an H.263 video codec algorithm, compared it with the previous AIR, GOP and PGOP coding schemes, and measured energy consumption and video quality on the IPAQ and Zaurus PDAs. Our experimental results show that our approach reduces energy consumption by 34%, 24% and 17% compared with AIR, GOP and PGOP schemes respectively, while incurring only a small fluctuation in the compressed frame size. In addition, our experimental results prove that our approach allows faster error recovery than the previous AIR, GOP and PGOP approaches. We believe our error resilient coding scheme is therefore eminently applicable for video communication on energy-constrained wireless mobile handheld devices.

design automation conference | 2014

Multi-Layer Memory Resiliency

Nikil D. Dutt; Puneet Gupta; Alex Nicolau; Abbas BanaiyanMofrad; Mark Gottscho; Majid Shoushtari

With memories continuing to dominate the area, power, cost and performance of a design, there is a critical need to provision reliable, high-performance memory bandwidth for emerging applications. Memories are susceptible to degradation and failures from a wide range of manufacturing, operational and environmental effects, requiring a multi-layer hardware/software approach that can tolerate, adapt and even opportunistically exploit such effects. The overall memory hierarchy is also highly vulnerable to the adverse effects of variability and operational stress. After reviewing the major memory degradation and failure modes, this paper describes the challenges for dependability across the memory hierarchy, and outlines research efforts to achieve multi-layer memory resilience using a hardware/software approach. Two specific exemplars are used to illustrate multi-layer memory resilience: first we describe static and dynamic policies to achieve energy savings in caches using aggressive voltage scaling combined with disabling faulty blocks; and second we show how software characteristics can be exposed to the architecture in order to mitigate the aging of large register files in GPGPUs. These approaches can further benefit from semantic retention of application intent to enhance memory dependability across multiple abstraction levels, including applications, compilers, run-time systems, and hardware platforms.

embedded systems for real-time multimedia | 2006

Annotation Based Multimedia Streaming Over Wireless Networks

Radu Cornea; Alex Nicolau; Nikil D. Dutt

The relatively high power consumption of wireless network interfaces represents an important detriment in multimedia streaming for mobile devices. The IEEE 802.11 built-in power saving mode was designed for transfers of a different nature and is not able to take advantage of the short idle intervals and continuous, periodic transmissions inherent in multimedia streaming. We propose an annotation based approach to wireless network power management that analyzes the variations in data transfer bandwidth during playback and uses the results to buffer data into larger burst transmissions with longer idle periods when the network card is transitioned into a lower power, sleep mode. Annotations allow for energy savings of up to 75% for the network interface, with practically no quality degradation or packet loss, only a small delay due to the buffer
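The idea of buffering a stream into larger bursts can be sketched as a simple greedy grouping. This assumes the annotations provide per-frame sizes in bytes (`frame_bytes`) and that the client has a fixed buffer (`buffer_bytes`); both names and the greedy rule are illustrative assumptions rather than the paper's algorithm.

```python
from typing import List


def plan_bursts(frame_bytes: List[int], buffer_bytes: int) -> List[List[int]]:
    """Group consecutive frames into bursts that each fit in the buffer.

    Returns a list of bursts, each a list of frame indices. Between bursts
    the network interface could, in principle, be put to sleep.
    """
    bursts: List[List[int]] = []
    current: List[int] = []
    used = 0
    for index, size in enumerate(frame_bytes):
        if size > buffer_bytes:
            raise ValueError(f"frame {index} does not fit in the buffer")
        if used + size > buffer_bytes:
            # Buffer would overflow: close this burst and start a new one.
            bursts.append(current)
            current, used = [], 0
        current.append(index)
        used += size
    if current:
        bursts.append(current)
    return bursts
```

For frame sizes `[300, 200, 500, 400, 100]` and a 1000-byte buffer, the plan is `[[0, 1, 2], [3, 4]]`: two transmissions instead of five, with one longer idle gap between them.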
