P. Faraboschi | Researchain

Publication

Featured research published by P. Faraboschi.

Proceedings of the IEEE | 1995

Hardware solutions for fuzzy control

A. Costa; A. De Gloria; P. Faraboschi; A. Pagni; G. Rizzotto

A large fraction of software designs using microcontrollers is today adopting fuzzy logic algorithms and this fraction is likely to increase in the future. Hardware implementation of fuzzy logic ranges from standard microprocessors to dedicated ASICs and each different approach is targeted to a different application domain or market area. In this paper, we present an overview of the computational complexity of the fuzzy inference process and the various techniques adopted for fuzzy control tasks, highlighting the tradeoffs that can guide a system designer toward correct choices according to application features and cost/performance issues. In addition, we detail three case studies of architectures that address three different market segments in the fuzzy hardware scenario: dedicated fuzzy coprocessors, RISC processors with specialized fuzzy support and application specific fuzzy ASICs.
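To make the cost of the fuzzy inference process concrete, the following is a minimal sketch of a textbook Mamdani-style inference step with one input and one output. It is not taken from the paper: the function names, the triangular membership shape, and the sampled centroid defuzzification are all illustrative assumptions.

```python
def triangle(x, left, peak, right):
    """Triangular membership function: 0 outside (left, right), 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def infer(value, rules, samples):
    """Hypothetical single-input, single-output Mamdani inference.

    rules   : list of (input_set, output_set); each set is a (left, peak, right) triple
    samples : output values at which the aggregated fuzzy set is evaluated
    """
    # Fuzzification: how strongly the input matches each rule's antecedent.
    strengths = [triangle(value, *in_set) for in_set, _ in rules]

    # Implication (min) and aggregation (max) at every output sample.
    aggregated = [
        max(min(w, triangle(y, *out_set)) for w, (_, out_set) in zip(strengths, rules))
        for y in samples
    ]

    # Defuzzification: centroid of the sampled output set.
    total = sum(aggregated)
    if total == 0.0:
        return 0.0  # no rule fired
    return sum(y * m for y, m in zip(samples, aggregated)) / total


# Three symmetric rules: negative, zero, and positive input map to the same output sets.
rules = [((-2, -1, 0), (-2, -1, 0)), ((-1, 0, 1), (-1, 0, 1)), ((0, 1, 2), (0, 1, 2))]
samples = [i / 10 for i in range(-20, 21)]
print(infer(0.4, rules, samples))
```

In this sketch the work per inference grows with the number of rules multiplied by the number of output samples, which is the kind of cost that dedicated hardware or specialized instructions would aim to reduce.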

International Symposium on Microarchitecture | 1993

An analysis of dynamic scheduling techniques for symbolic applications

A. Costa; Alessandro De Gloria; P. Faraboschi; Mauro Olivieri

Instruction-level parallelism in a single stream of code for non-numerical applications has been the subject of much recent research. This work extends the analysis to symbolic applications described with logic programming. In particular, the authors analyze the effects on performance of speculative execution, memory alias disambiguation, renaming and flow prediction. The results indicate that, with the proper optimizations, one can reach a sustained parallelism of 4 (comparable with imperative languages). The authors also compare statically and dynamically scheduled approaches, outlining the conditions under which a dynamic solution can achieve substantial improvements over a static one. In this way, they point out some important optimizations and parameters of a dynamic scheduling approach, providing a guideline for future architectural implementations.

Annual European Computer Conference | 1992

A programmable instruction format extension to VLIW architectures

A. De Gloria; P. Faraboschi

While very long instruction word (VLIW) architectures permit static extraction of a valuable amount of concurrency, their major drawback lies in the considerable code memory size requirements, due to the horizontal nature of the instruction set. To overcome this inefficiency, the authors propose a programmable instruction format extension, where the compiler is responsible for choosing the best combinations of operations that are allowed to execute concurrently. This results in a substantial saving of instruction bits, at the sole expense of some additional memory for decoding circuitry. An example application on a sample architecture shows that performance degradation remains small even when the instruction width is reduced by a factor of three.

International Symposium on Microarchitecture | 1990

An evaluation system for application specific architectures

Alessandro De Gloria; P. Faraboschi

Application specific architectures are assuming an important role in the design of tailored systems as they enable a better cost/performance ratio, by exploiting application intrinsic features, with respect to standard components. An ASA design environment has been developed in order to allow the evaluation of different architecture solutions in terms of cost and performance. The system deals with parallel synchronous non-homogeneous architectures and, starting from the high-level description of the application benchmarks, reaches code generation and simulation of architectures whose description can range from simple timing organization to detailed data-path and instruction structures. As an application example, the system is applied to the comparison of pipelined and parallel micro-architecture organizations for floating-point processing.

IEEE Transactions on Consumer Electronics | 1995

A VLSI architecture for hierarchical motion estimation

A. Costa; A. De Gloria; P. Faraboschi; F. Passaggio

Motion estimation is the critical path in compression algorithms, and several dedicated hardware solutions have been proposed for its acceleration. In this paper we present an innovative VLSI architecture for motion estimation that combines a low implementation cost with real-time performance for videoconferencing and DTV standards. To minimize computational load and architecture requirements, we adopt a three-step hierarchical search algorithm, which provides a quality comparable with more expensive full-search techniques. The proposed solution focuses on architectural efficiency by employing a minimum set of functional units (three simple processing units, one minimum unit, and four programmable delay lines), while still supporting real-time performance for videoconferencing standards. In addition, we show how to combine parallel motion estimation units for backward-forward prediction (MPEG) and for higher throughput standards (DTV).
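As an illustration of why a three-step search is cheaper than a full search, here is a minimal sketch of the classic three-step block-matching search with a sum-of-absolute-differences (SAD) cost. It is an assumption that this resembles the hierarchical variant used in the paper; the function names, the 8x8 block size, and the step sizes 4, 2, 1 are illustrative choices, not details from the abstract.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    return sum(
        abs(cur[by + r][bx + c] - ref[by + dy + r][bx + dx + c])
        for r in range(n)
        for c in range(n)
    )


def three_step_search(cur, ref, bx, by, n=8, first_step=4):
    """Return the motion vector (dx, dy) found by a classic three-step search."""
    height, width = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, n)
    step = first_step
    while step >= 1:
        cx, cy = best  # centre of this refinement stage
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                mx, my = cx + dx, cy + dy
                # Skip candidates whose block would fall outside the reference frame.
                if not (0 <= bx + mx <= width - n and 0 <= by + my <= height - n):
                    continue
                cost = sad(cur, ref, bx, by, mx, my, n)
                if cost < best_cost:
                    best, best_cost = (mx, my), cost
        step //= 2  # halve the step: 4, 2, 1
    return best
```

With steps of 4, 2, and 1 this sketch performs at most 28 block comparisons per block, against 225 for an exhaustive search over the same range of plus or minus 7 pixels. The search is greedy, so it can settle on a vector that a full search would reject.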

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 1994

Block placement with a Boltzmann Machine

A. De Gloria; P. Faraboschi; Mauro Olivieri

The Boltzmann Machine is a neural model based on the same principles as simulated annealing that reaches good solutions, reduces the computational requirements, and is well suited for a low-cost, massively parallel hardware implementation. In this paper we present a connectionist approach to the problem of block placement in the plane to minimize wire length, based on its formalization in terms of the Boltzmann Machine. We detail the procedure to build the Boltzmann Machine by formulating the placement problem as a constrained quadratic assignment problem and by defining an equivalent 0-1 programming problem. The key features of the proposed model are: (1) high degree of parallelism in the algorithm, (2) high quality of the results, often near-optimal, and (3) support of a large variety of constraints such as arbitrary block shape, flexible aspect ratio, and rotations/reflections. Experimental results on different problem instances demonstrate the effectiveness of the method as an alternative to other deterministic and statistical techniques.

Parallel Computing | 1992

A dedicated massively parallel architecture for the Boltzmann machine

A. De Gloria; P. Faraboschi; S. Ridella

A key task for neural network research is the development of neurocomputers able to speed up the learning algorithms to allow their application and testing in real cases. This paper presents a massively parallel architecture specifically designed to support the Boltzmann machine (BM) neural network. The heart of this architecture is its simplicity and reliability together with a low implementation cost. Despite the impressive speedup obtained by accelerating the standard BM algorithm, the architecture does not use particular techniques to expose parallelism in the simulated annealing task, such as the change of state of multiple neurons. Features of the architecture include: (1) speed: the architecture allows a speedup of N (where N is the number of neurons constituting the BM) with respect to a standard implementation on sequential machines; (2) low cost: the architecture requires the same amount of memory as a sequential application, and the only additional cost is due to the inclusion of an adder for each neuron; (3) WSI capabilities: the processor interconnection is limited to a single bus for any number of implemented processors, the architecture is scalable in terms of number of processors without any software or hardware modification, and the simplicity of the processors makes it possible to implement built-in self-test techniques; (4) high weight dynamics: the architecture performs computation using 32-bit integer values, therefore offering a wide range of variability of weights.

World Congress on Computational Intelligence | 1994

An optimized RISC instruction set for fuzzy applications

E. Avogadro; S. Commodaro; A. Costa; A. De Gloria; P. Faraboschi; F. Giudici; A. Pagni

Presents a general purpose RISC instruction set with specific support for fuzzy logic. The authors show the design methodology, the processor architecture and its performance with respect to other fuzzy processors. Tests show a remarkable advantage over other realizations. These results have been achieved by means of a particularly suited instruction set and special optimizations in the compilation task.

Parallel Computing | 1991

A Boltzmann machine approach to code optimization

A. De Gloria; P. Faraboschi

In this paper we present a statistical approach to the problem of horizontal code compaction for the class of parallel synchronous non-homogeneous architectures. The proposed technique is based on a formulation of the problem using the Boltzmann Machine model. The formalization of the compaction process in terms of a non-deterministic connectionist framework makes it possible to cope with the NP-hardness of the problem and to avoid the introduction of specialized heuristics, which necessarily limit the generality of other techniques. Compilations of some Livermore loop kernels are presented, showing that the quality of the code produced with a Boltzmann Machine is comparable with hand-compacted code. These results demonstrate that the method is an effective alternative to traditional compilation techniques.

IEEE Transactions on Neural Networks | 1993

Efficient implementation of the Boltzmann machine algorithm

A. De Gloria; P. Faraboschi; Mauro Olivieri

The problem of optimizing the sequential algorithm for the Boltzmann machine (BM) is addressed. A solution that is based on the locality properties of the algorithm and makes possible the efficient computation of the cost difference between two configurations is presented. Since the algorithm performance depends on the number of accepted state transitions in the annealing process, a theoretical procedure is formulated to estimate the acceptance probability of a state transition. In addition, experimental data are provided on a well-known optimization problem, the travelling salesman problem, to verify the theory numerically and to show that the proposed solution obtains a speedup between 3 and 4 in comparison with the traditional algorithm.
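The locality property mentioned above can be sketched with the textbook Boltzmann machine formulation, in which flipping one unit changes the cost (consensus) by an amount that depends only on that unit's own connections. The code below is a minimal sketch under that assumption and does not reproduce the paper's optimized algorithm; the consensus definition, the logistic acceptance rule, and all function names are illustrative.

```python
import math
import random


def consensus_delta(weights, bias, state, k):
    """Change in consensus if unit k flips, using only unit k's connections.

    Assumes C(s) = sum_{i<j} w[i][j]*s[i]*s[j] + sum_i bias[i]*s[i],
    with symmetric weights and binary states s[i] in {0, 1}.
    """
    local_field = bias[k] + sum(
        weights[k][j] * state[j] for j in range(len(state)) if j != k
    )
    return (1 - 2 * state[k]) * local_field


def acceptance_probability(delta, temperature):
    """Logistic acceptance rule commonly used for Boltzmann machines."""
    x = -delta / temperature
    if x > 700.0:  # avoid overflow in math.exp; the probability is effectively 0
        return 0.0
    return 1.0 / (1.0 + math.exp(x))


def anneal_step(weights, bias, state, temperature, rng=random):
    """Propose flipping one random unit and accept it with the logistic probability."""
    k = rng.randrange(len(state))
    delta = consensus_delta(weights, bias, state, k)
    if rng.random() < acceptance_probability(delta, temperature):
        state[k] = 1 - state[k]
        return True
    return False
```

Evaluating a proposed flip this way costs one pass over a single row of the weight matrix instead of a full recomputation of the consensus, and the acceptance probability is what determines how many proposed transitions are actually carried out at a given temperature.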
