Daniel Frederic Finchelstein
Nvidia
Publication
Featured research published by Daniel Frederic Finchelstein.
IEEE Transactions on Computers | 2005
Benton H. Calhoun; Denis C. Daly; Naveen Verma; Daniel Frederic Finchelstein; David D. Wentzloff; Alice Wang; Seong Hwan Cho; Anantha P. Chandrakasan
This tutorial paper examines architectural and circuit design techniques for a microsensor node operating at power levels low enough to enable the use of an energy harvesting source. These requirements place demands on all levels of the design. We propose an architecture for achieving the required ultra-low energy operation and discuss the circuit techniques necessary to implement the system. Dedicated hardware implementations improve the efficiency for specific functionality, and modular partitioning permits fine-grained optimization and power-gating. We describe modeling and operating at the minimum energy point in the subthreshold region for digital circuits. We also examine approaches for improving the energy efficiency of analog components like the transmitter and the ADC. A microsensor node using the techniques we describe can function in an energy-harvesting scenario.
Proceedings of the IEEE | 2010
Anantha P. Chandrakasan; Denis C. Daly; Daniel Frederic Finchelstein; Joyce Kwong; Yogesh K. Ramadass; Mahmut E. Sinangil; Vivienne Sze; Naveen Verma
Energy efficiency of electronic circuits is a critical concern in a wide range of applications from mobile multi-media to biomedical monitoring. An added challenge is that many of these applications have dynamic workloads. To reduce the energy consumption under these variable computation requirements, the underlying circuits must function efficiently over a wide range of supply voltages. This paper presents voltage-scalable circuits such as logic cells, SRAMs, ADCs, and dc-dc converters. Using these circuits as building blocks, two different applications are highlighted. First, we describe an H.264/AVC video decoder that efficiently scales between QCIF and 1080p resolutions, using a supply voltage varying from 0.5 V to 0.85 V. Second, we describe a 0.3 V 16-bit micro-controller with on-chip SRAM, where the supply voltage is generated efficiently by an integrated dc-dc converter.
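As a rough illustration of why voltage scalability matters, the sketch below applies the textbook first-order model in which switching energy per operation scales as C·V². This model is an assumption introduced here, not a result from the paper; it ignores leakage and any change in switched capacitance, so it only indicates the order of the saving between the two supply voltages quoted for the video decoder.

```python
def dynamic_energy_ratio(v_low: float, v_high: float) -> float:
    """Ratio of switching energy per operation at v_low relative to v_high.

    Assumes the first-order model E_dyn = C * V**2 with constant C
    (an illustrative assumption, not taken from the paper).
    """
    return (v_low / v_high) ** 2


# Supply range quoted in the abstract for the H.264/AVC decoder.
ratio = dynamic_energy_ratio(0.5, 0.85)
print(f"Switching energy at 0.5 V is about {ratio:.0%} of that at 0.85 V")
```

Under this simplified model, running at the low end of the range costs roughly a third of the switching energy per operation, which is the motivation for circuits that keep working across the whole range.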
Asian Solid-State Circuits Conference | 2008
Nathan Ickes; Daniel Frederic Finchelstein; Anantha P. Chandrakasan
We describe a micropower DSP intended for medium bandwidth microsensor applications (such as acoustic sensing and tracking) which achieves 4 MIPS performance at 40 μW (10 pJ per instruction). Architectural optimizations for energy efficiency include a custom CPU instruction set, miniature instruction cache, hardware accelerator cores for FIR filter and FFT operations, and extensive power gating of both logic and memory.
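The 10 pJ per instruction figure follows directly from the quoted power and throughput: average power divided by instruction rate. A minimal check, with a hypothetical helper name chosen for illustration:

```python
def energy_per_instruction_pj(power_watts: float, instructions_per_second: float) -> float:
    """Average energy per instruction, in picojoules."""
    return power_watts / instructions_per_second * 1e12


# 40 microwatts at 4 million instructions per second, as quoted in the abstract.
epi = energy_per_instruction_pj(40e-6, 4e6)
print(f"{epi:.1f} pJ per instruction")  # prints "10.0 pJ per instruction"
```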
IEEE Transactions on Circuits and Systems for Video Technology | 2009
Daniel Frederic Finchelstein; Vivienne Sze; Anantha P. Chandrakasan
Performance requirements for video decoding will continue to rise in the future due to the adoption of higher resolutions and faster frame rates. Multicore processing is an effective way to handle the resulting increase in computation. For power-constrained applications such as mobile devices, extra performance can be traded off for lower power consumption via voltage scaling. As memory power is a significant part of system power, it is also important to reduce unnecessary on-chip and off-chip memory accesses. This paper proposes several techniques that enable multiple parallel decoders to process a single video sequence; the paper also demonstrates several on-chip caching schemes. First, we describe techniques that can be applied to the existing H.264 standard, such as multiframe processing. Second, with an eye toward future video standards, we propose replacing the traditional raster-scan processing with an interleaved macroblock ordering; this can increase parallelism with minimal impact on coding efficiency and latency. The proposed architectures allow N parallel hardware decoders to achieve a speedup of up to a factor of N. For example, if N=3, the proposed multiple frame and interleaved entropy slice multicore processing techniques can achieve performance improvements of 2.64× and 2.91×, respectively. This extra hardware performance can be used to decode higher definition videos. Alternatively, it can be traded off for dynamic power savings of 60% relative to a single nominal-voltage decoder. Finally, on-chip caching methods are presented that significantly reduce off-chip memory bandwidth, leading to a further increase in performance and energy efficiency. Data-forwarding caches can reduce off-chip memory reads by 53%, while using a last-frame cache can eliminate 80% of the off-chip reads.
The proposed techniques were validated and benchmarked using full-system Verilog hardware simulations based on an existing decoder; they should also be applicable to most other decoder architectures. The metrics used to evaluate the ideas in this paper are performance, power, area, memory efficiency, coding efficiency, and input latency.
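To put the quoted speedups in context, the sketch below computes parallel efficiency, defined here as measured speedup divided by the number of decoders N. The function name and the efficiency metric are introduced for illustration only; the abstract reports the speedups but not this derived figure.

```python
def parallel_efficiency(speedup: float, n_decoders: int) -> float:
    """Fraction of the ideal N-fold speedup that was actually achieved."""
    return speedup / n_decoders


# Speedups quoted in the abstract for N = 3 parallel decoders.
quoted = {
    "multiple frame": 2.64,
    "interleaved entropy slice": 2.91,
}

for technique, speedup in quoted.items():
    eff = parallel_efficiency(speedup, 3)
    print(f"{technique}: {speedup:.2f}x of an ideal 3x ({eff:.0%} efficiency)")
```

Both techniques come close to the ideal factor of N, with the interleaved entropy slice approach leaving only a few percent of the ideal speedup unrealized.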
Asian Solid-State Circuits Conference | 2008
Daniel Frederic Finchelstein; Vivienne Sze; Mahmut E. Sinangil; Y. Koken; Anantha P. Chandrakasan
The H.264/AVC video coding standard can deliver high compression efficiency at a cost of large complexity and power. The increasing popularity of video capture and playback on portable devices requires that the energy of the video codec be kept to a minimum. This paper proposes several architecture optimizations such as increased parallelism, multiple voltage/frequency domains, and custom voltage-scalable SRAMs that enable low voltage operation and reduce the power of a high-definition decoder. An H.264/AVC Baseline Level 3.1 decoder ASIC was fabricated in 65 nm CMOS and verified. It operates down to 0.7 V and has a measured power of 1.8 mW when decoding a high-definition 720p video at 30 frames per second, which is over an order of magnitude lower than previously published results.
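The measured power and frame rate can be converted into energy per frame and per pixel. The sketch below does this arithmetic; the 1280×720 frame size is an assumption (the usual meaning of 720p) and is not stated in the abstract, and the per-pixel figure is derived here rather than reported by the authors.

```python
POWER_W = 1.8e-3           # measured decoder power quoted in the abstract
FRAMES_PER_SECOND = 30     # frame rate quoted in the abstract
WIDTH, HEIGHT = 1280, 720  # assumed 720p luma resolution (not given in the abstract)

energy_per_frame_j = POWER_W / FRAMES_PER_SECOND
energy_per_pixel_j = energy_per_frame_j / (WIDTH * HEIGHT)

print(f"Energy per frame: {energy_per_frame_j * 1e6:.1f} uJ")
print(f"Energy per pixel: {energy_per_pixel_j * 1e12:.1f} pJ")
```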
IEEE Design & Test of Computers | 2011
Jo C. Ebergen; Daniel Frederic Finchelstein; Russell Kao; Jon Lexau; David Hopkins
This article presents a case study of a fast and energy-efficient hardware implementation of a stack. The design is highly scalable, as its cycle time remains unchanged and energy per operation grows very slowly, with an increase in the number of storage locations. This design example demonstrates two often-claimed benefits of asynchronous circuit design: the potential for high average-case performance and low power consumption.
Archive | 2006
Anantha P. Chandrakasan; Naveen Verma; J. Kwong; Denis C. Daly; Nathan Ickes; Daniel Frederic Finchelstein; Benton H. Calhoun
Archive | 2013
Daniel Frederic Finchelstein; David Conrad Tannenbaum; Srinivasan Iyer
Archive | 2010
Anantha P. Chandrakasan; Denis C. Daly; Daniel Frederic Finchelstein; Joyce Kwong; Yogesh K. Ramadass; Mahmut E. Sinangil; Vivienne Sze; Naveen Verma
IEEE | 2009
Daniel Frederic Finchelstein; Vivienne Sze; Mahmut E. Sinangil; Anantha P. Chandrakasan