Mauro Chinosi | Researchain

Publication

Featured research published by Mauro Chinosi.

international electron devices meeting | 1994

Flash-based programmable nonlinear capacitor for switched-capacitor implementations of neural networks

Alan Kramer; Marco Sabatini; Roberto Canegallo; Mauro Chinosi; Pierluigi Rolandi; P. Zabberoni

The use of flash devices for both analog storage and analog computation can result in highly efficient switched-capacitor implementations of neural networks. The standard flash device suffers from severe limitations in this application due to relatively large parasitic overlap capacitances. This paper introduces the computational concept, circuit and architecture we are exploring as well as a novel flash-based programmable nonlinear capacitor with much improved charge domain characteristics for our application. These devices are demonstrated in a novel circuit consisting of only two devices and capable of computing a 5-bit absolute-value-of-difference at an energy consumption of less than 1 pJ.

international symposium on low power electronics and design | 1998

Automatic characterization and modeling of power consumption in static RAMs

Mauro Chinosi; Roberto Zafalon; Carlo Guardiani

An automatic modeling technique is presented in this paper that allows one to build an accurate model of power consumption in embedded memory blocks. A software neural-network is used to create a regression tree by automatically splitting those variables that have a discontinuous effect on the power consumption. An application of the methodology to the modeling of a 0.35 μm CMOS embedded SRAM is presented.

Design Automation for Embedded Systems | 2002

A Framework for Modeling and Estimating the Energy Dissipation of VLIW-Based Embedded Systems

Luca Benini; Davide Bruni; Mauro Chinosi; Cristina Silvano; Vittorio Zaccaria; Roberto Zafalon

This paper describes a technique for modeling and estimating the power consumption at the system level for embedded VLIW (Very Long Instruction Word) architectures. The method is based on a hierarchy of dynamic power estimation engines: from the instruction level down to the gate/transistor level. Power macro-models have been developed for the main components of the system: the VLIW core, the register file, the instruction and data caches. The main goal is to define a system-level simulation framework for the dynamic profiling of the power behavior during the software execution, providing also a breakdown of the power contributions due to the single components of the system. The proposed approach has been applied to the Lx family of scalable embedded VLIW processors, jointly designed by STMicroelectronics and HP Labs. Experimental results, carried out over a set of benchmarks for embedded multimedia applications, have demonstrated an average accuracy of 5% of the instruction-level estimation engine with respect to the RTL engine, with an average speed-up of four orders of magnitude.

international symposium on low power electronics and design | 1995

Ultra-low-power analog associative memory core using flash-EEPROM-based programmable capacitors

Alan Kramer; Roberto Canegallo; Mauro Chinosi; D. Doise; Giovanni Gozzini; Pier Luigi Rolandi; Marco Sabatini; P. Zabberoni

Analog techniques can lead to ultra-efficient computational systems when applied to the right applications. The problem of associative memory is well suited to array-based analog implementation. The architectures which result can be ultra-efficient both in terms of high density and low power consumption. We have implemented a small (16x512) analog associative memory array which uses programmable nonlinear capacitors based on flash EEPROM technology for both analog storage and analog Manhattan Distance computation. The core circuit involved is based on only two of these novel devices. Preliminary results from this test circuit indicate that we can achieve a computing precision of more than 8 digital-equivalent bits in a chip which is capable of performing 128 Giga absolute-value-of-difference-accumulate operations per second at a power consumption of less than 150 mW. Performance of this level is more than an order of magnitude more efficient than the best low-power digital techniques and demonstrates the potential advantages analog implementation has to offer when applied to certain applications.

Introduction

Associative Memory

The function of an associative memory, or content-addressable memory, is more or less the inverse of that of a random access memory: when presented with a partial or complete data vector, the memory should return the row address of the internally stored data vector which best "matches" the input data vector. The matching function is typically a distance function; in standard digital implementations Hamming distance is usually used. Associative memory lends itself to array-based parallel implementation. A typical architecture consists of a 2-dimensional distance-computing / memory array, and several 1-dimensional arrays including an accumulator array for accumulating distances, a comparator array for finding the smallest distance, a priority encoder array for selecting rows one at a time, and a ROM array for presenting outputs [5].

Analog Associative Memory

We are exploring an analog implementation of this type of architecture for Associative Memory. The result is an analog associative memory in which both stored memory rows and inputs consist of analog-valued vectors (5-bit equivalent precision). The goal is to achieve an ultra-efficient design in terms of both density and power consumption. Our target is an associative memory containing 4K lines of 64-dimensional memory vectors and capable of performing nearest neighbor match based on Manhattan Distance in less than 2 µs at a power consumption of less than 150 mW. Computation of 4K 64-dimensional Manhattan Distances requires 256K 5-bit absolute-value-of-difference-accumulate computations, thus achieving a cycle time of 2 µs requires performing 128G of these operations per second. Performing this much computation on a single chip at a power consumption of less than 150 mW represents an increase in efficiency both in terms of density and power consumption of more than an order of magnitude over the best low-power digital techniques [1]. Practical realization of computing systems based on analog techniques may provide a viable alternative for ultra-efficient system design if the design generality lost can be justified by the added efficiency gained.

* This work has been partially sponsored by U. C. Berkeley where Mr. Kramer is completing a Ph.D.
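The matching function and the throughput arithmetic in this abstract can be illustrated with a minimal software sketch. This is not the authors' analog circuit: the function names `manhattan_match` and `required_ops_per_second` and the sample vectors are hypothetical, and "4K" is assumed to mean 4096 rows.

```python
def manhattan_match(stored_rows, query):
    """Return (row_index, distance) of the stored row closest to the query.

    Distance is the Manhattan distance: the sum over all dimensions of the
    absolute value of the difference (one accumulate step per dimension).
    Ties are resolved in favor of the lowest row index.
    """
    best_index, best_distance = -1, None
    for index, row in enumerate(stored_rows):
        distance = sum(abs(a - b) for a, b in zip(row, query))
        if best_distance is None or distance < best_distance:
            best_index, best_distance = index, distance
    return best_index, best_distance


def required_ops_per_second(rows, dims, cycle_time_s):
    """Absolute-value-of-difference-accumulate operations needed per second."""
    return rows * dims / cycle_time_s


# Hypothetical 5-bit-equivalent values (0..31), 3 rows of 4 dimensions.
memory = [[0, 31, 15, 7], [3, 30, 16, 8], [31, 0, 0, 31]]
print(manhattan_match(memory, [2, 29, 16, 9]))   # (1, 3)

# Assumed target from the text: 4K rows, 64 dimensions, 2 microsecond cycle.
print(required_ops_per_second(4096, 64, 2e-6))
```

With these assumptions one match needs 4096 × 64 = 262144 (256K) operations, so a 2 µs cycle corresponds to roughly 1.3 × 10^11 operations per second, which the text expresses as 128G.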

international conference on document analysis and recognition | 1997

Words recognition using associative memory

Loris Giuseppe Navoni; Roberto Canegallo; Mauro Chinosi; Giovanni Gozzini; Alan Kramer; Pier Luigi Rolandi

Introduces the application of an analog associative memory chip to word recognition, which is a fundamental topic of the text recognition process. The word recognition method takes advantage of a statistical evaluation of the behavior of the optical character recognition system preceding it. That statistical information leads to the creation of a coding that is used to store a lexicon of the most used words in the chip. An input pattern is matched against the full database of the associative memory, and a set of closest patterns is returned. The precision reached by this operation ranges from 93% to 99%. These encouraging results demonstrate the general aptitude of the chip to solve classes of problems that need to use an associative memory.

design automation conference | 1999

Parallel mixed-level power simulation based on spatio-temporal circuit partitioning

Mauro Chinosi; Roberto Zafalon; Carlo Guardiani

In this work we propose a technique for spatial and temporal partitioning of a logic circuit based on the node activity computed by using a simulation at a higher level of abstraction. Only those components that are activated by a given input vector are added to the detailed simulation netlist. The methodology is suitable for parallel implementation on a multi-processor environment and allows us to arbitrarily switch between fast and detailed levels of abstraction during the simulation run. The experimental results obtained on a significant set of benchmarks show that it is possible to obtain a considerable reduction in both CPU time and memory occupation together with a considerable degree of accuracy. Furthermore, the proposed technique easily fits in the existing industrial design flows.
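The selection step described in this abstract (only activated components enter the detailed netlist) might be sketched as follows. This is an illustrative assumption, not the paper's algorithm: the function `partition_by_activity`, the per-window toggle counts and the component names are all hypothetical.

```python
def partition_by_activity(activity, threshold=0):
    """Pick, for each time window, the components worth simulating in detail.

    activity maps a component name to a list of toggle counts, one per time
    window, as a fast high-level simulation might report them (assumed
    format). A component is selected for a window when its toggle count
    exceeds the threshold.
    """
    window_count = max((len(counts) for counts in activity.values()), default=0)
    return [
        sorted(
            name
            for name, counts in activity.items()
            if window < len(counts) and counts[window] > threshold
        )
        for window in range(window_count)
    ]


# Hypothetical activity profile over three time windows.
activity = {"alu": [5, 0, 2], "fpu": [0, 0, 7], "ctrl": [1, 1, 1]}
detailed = partition_by_activity(activity)
print(detailed)  # [['alu', 'ctrl'], ['ctrl'], ['alu', 'ctrl', 'fpu']]
```

Each window's list is independent of the others, which is one way to see why such a scheme lends itself to parallel execution.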

power and timing modeling optimization and simulation | 2000

Architectural Design Space Exploration Achieved through Innovative RTL Power Estimation Techniques

Manuela Anton; Mauro Chinosi; Daniele Sirtori; Roberto Zafalon

Today's design community needs tools that address early power estimation, making it possible to find the optimal design trade-offs without respinning to explore the whole chip. Several approaches have been published in recent years that rely on a fast (coarse) logic synthesis step in order to analyze power on the mapped gate-level netlist and then create suitable power models. In this paper we present some applications of RTPow, a proprietary tool for RT-level power estimation. Its innovative estimation engine does not perform any on-the-fly logic synthesis but analyzes the HDL description from a functional point of view, which permits a drastic time saving. Besides this top-down estimation, RTPow supports a series of power macromodels and a bottom-up approach that enable effective power budgeting. The first application is an Adaptive Gaussian Noise Filter (28K Eq.Gate), described in VHDL; the second is a Motion Estimation and Compensation Device for Video Field Rate Doubling Application (171K Eq.Gate), also described in VHDL; the third is a microprocessor core (111K Eq.Gate) described in Verilog.

international conference on microelectronics | 1996

A low-power high-precision tunable WINNER-TAKE-ALL network

Roberto Canegallo; Mauro Chinosi; Alan Kramer

This paper describes a low-power CMOS circuit for selecting the greatest of n analog voltages within a tunable selection range. An increasing-speed, decreasing-precision law is used to determine the amplitude of the selection range. A resolution of 16 mV to 4 mV, over a 2 V to 4 V dynamic input range, can be obtained by reducing the speed from 2 MHz to 500 kHz. A quiescent current of 1 μA, an AC current of 2 μA for the selected cells, and small size make this circuit suitable for VLSI implementations of massively parallel analog computational circuits.
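A behavioral sketch of winner-take-all selection with a tunable resolution is given below. It assumes one possible reading of the abstract, namely that inputs closer to the maximum than the current resolution cannot be told apart from the winner. The function `winner_take_all` and the sample voltages are hypothetical, and nothing here models the CMOS circuit itself.

```python
def winner_take_all(voltages_mv, resolution_mv):
    """Return indices of inputs within resolution_mv of the largest input.

    Voltages are given in integer millivolts to avoid floating-point
    rounding. A coarser resolution (faster operation, per the abstract)
    selects more cells; a finer one isolates the true maximum.
    """
    top = max(voltages_mv)
    return [i for i, v in enumerate(voltages_mv) if top - v < resolution_mv]


# Hypothetical inputs inside the 2 V to 4 V range quoted in the abstract.
inputs_mv = [2500, 3100, 3090, 2900]
print(winner_take_all(inputs_mv, 16))  # [1, 2]  (16 mV resolution)
print(winner_take_all(inputs_mv, 4))   # [1]     (4 mV resolution)
```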

Archive | 1997