Panagiotis D. Vouzis | Researchain

Publication

Featured researches published by Panagiotis D. Vouzis.

Bioinformatics | 2011

GPU-BLAST

Panagiotis D. Vouzis; Nikolaos V. Sahinidis

Motivation: The Basic Local Alignment Search Tool (BLAST) is one of the most widely used bioinformatics tools. The widespread impact of BLAST is reflected in over 53 000 citations that this software has received in the past two decades, and the use of the word ‘blast’ as a verb referring to biological sequence comparison. Any improvement in the execution speed of BLAST would be of great importance in the practice of bioinformatics, and facilitate coping with ever increasing sizes of biomolecular databases. Results: Using a general-purpose graphics processing unit (GPU), we have developed GPU-BLAST, an accelerated version of the popular NCBI-BLAST. The implementation is based on the source code of NCBI-BLAST, thus maintaining the same input and output interface while producing identical results. In comparison to the sequential NCBI-BLAST, the speedups achieved by GPU-BLAST range mostly between 3 and 4. Availability: The source code of GPU-BLAST is freely available at http://archimedes.cheme.cmu.edu/biosoftware.html. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

IEEE Transactions on Control Systems Technology | 2009

A System-on-a-Chip Implementation for Embedded Real-Time Model Predictive Control

Panagiotis D. Vouzis; Mayuresh V. Kothare; Leonidas Bleris; Mark G. Arnold

This paper presents a hardware architecture for embedded real-time model predictive control (MPC). The computational cost of an MPC problem, which relies on the solution of an optimization problem at every time step, is dominated by operations on real matrices. In order to design an efficient and low-cost application-specific processor, we analyze the computational cost of MPC, and we propose a limited-resource host processor to be connected with an application-specific matrix coprocessor. The coprocessor uses a 16-b logarithmic number system arithmetic unit, which is designed using cotransformation, to carry out the required arithmetic operations. The proposed architecture is implemented by means of a hardware description language and then prototyped and emulated on a field-programmable gate array. Results on computation time and architecture area are presented and analyzed, and the functionality of the proposed architecture is verified using two case studies: a linear problem of a rotating antenna and a nonlinear glucose-regulation problem. The proposed MPC architecture yields a small-in-size and energy-efficient implementation that is capable of solving the aforementioned problems on the order of milliseconds, and we compare its performance and area requirements with other MPC designs that have appeared in the literature.

American Control Conference | 2006

A co-processor FPGA platform for the implementation of real-time model predictive control

Leonidas Bleris; Panagiotis D. Vouzis; Mark G. Arnold; Mayuresh V. Kothare

In order to effectively control nonlinear and multivariable models, and to incorporate constraints on system states, inputs and outputs (bounds, rate of change), a suitable (sometimes necessary) controller is model predictive control (MPC). MPC is an optimization-based control scheme that requires abundant matrix operations for the calculation of the optimal control moves. In this work we propose a mixed software and hardware embedded MPC implementation. Using a codesign step and based on profiling results, we decompose the optimization algorithm into two parts: one that fits into a host processor and one that fits into a custom-made unit that performs the computationally demanding arithmetic operations. The profiling results and information on the co-processor design are provided.

Parallel Computing | 2010

GPU computing with Kaczmarz's and other iterative algorithms for linear systems

Joseph M. Elble; Nikolaos V. Sahinidis; Panagiotis D. Vouzis

The graphics processing unit (GPU) is used to solve large linear systems derived from partial differential equations. The differential equations studied are strongly convection-dominated, of various sizes, and common to many fields, including computational fluid dynamics, heat transfer, and structural mechanics. The paper presents comparisons between GPU and CPU implementations of several well-known iterative methods, including Kaczmarz's, Cimmino's, component averaging, conjugate gradient normal residual (CGNR), symmetric successive overrelaxation-preconditioned conjugate gradient, and conjugate-gradient-accelerated component-averaged row projections (CARP-CG). Computations are performed with dense as well as general banded systems. The results demonstrate that our GPU implementation outperforms CPU implementations of these algorithms, as well as previously studied parallel implementations on Linux clusters and shared memory systems. While the CGNR method had begun to fall out of favor for solving such problems, for the problems studied in this paper, the CGNR method implemented on the GPU performed better than the other methods, including a cluster implementation of the CARP-CG method.
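As an illustration of the first method named in this abstract, the following is a minimal CPU sketch of the classical cyclic Kaczmarz iteration, in which the iterate is projected onto one equation's hyperplane at a time. It is not the GPU implementation from the paper; the function name `kaczmarz`, the `sweeps` parameter, and the 2x2 test system are assumptions made for this example.

```python
def kaczmarz(A, b, sweeps=50):
    """Cyclic Kaczmarz iteration for A x = b, with A given as a list of rows.

    Illustrative sketch only: plain Python on the CPU, starting from x = 0.
    """
    x = [0.0] * len(A[0])
    for _ in range(sweeps):
        for row, b_i in zip(A, b):
            norm_sq = sum(a * a for a in row)
            if norm_sq == 0.0:
                continue  # an all-zero row defines no hyperplane; skip it
            # Signed distance term: how far x is from satisfying row . x = b_i.
            residual = b_i - sum(a * x_j for a, x_j in zip(row, x))
            step = residual / norm_sq
            # Orthogonal projection of x onto the hyperplane row . x = b_i.
            x = [x_j + step * a for x_j, a in zip(x, row)]
    return x


# Hypothetical consistent system whose exact solution is x = (1, 2).
A_example = [[4.0, 1.0], [1.0, 3.0]]
b_example = [6.0, 7.0]
solution = kaczmarz(A_example, b_example)
print(solution)
```

Each row update touches only one equation, which is why row-projection methods of this kind are natural candidates for the parallel hardware discussed in the abstract.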

Computers & Chemical Engineering | 2011

GPU simulations for risk assessment in CO2 geologic sequestration

Yan Zhang; Panagiotis D. Vouzis; Nikolaos V. Sahinidis

A main concern for any CO2 sequestration system is whether it may leak CO2 over a long-term time horizon. The outcome depends on the competition between sequestration and leakage processes. Leakages may occur from failure of manmade material or through faults in the formations above the reservoir. A simple and computationally efficient simulator was constructed based on the CQUESTRA model (LeNeveu, 2008). To assess the risk associated with uncertainty in the values of the parameters in this model, thousands of runs were carried out with the simulator on a general-purpose graphics processing unit (GPU). The GPU implementation was up to 64 times faster compared to a CPU implementation. In the absence of active faults around a single injection well, the model suggests that leakages of more than 1% of the total CO2 are unlikely during the 1000-year period after dissipation of temperature and pressure transients associated with injection. Leakage amounts for ten leaky wells are considerably higher, suggesting the critical importance of monitoring equipment after sequestration.

International Workshop on Computer Architecture for Machine Perception | 2007

A Scalable Design for Signal Conditioning and Digitization in Implantable Multi-Channel Neural Sensors

Panagiotis D. Vouzis; Sylvain Collange; Mark G. Arnold

The logarithmic number system (LNS) makes multiplication, division and powering easy, but subtraction is expensive. Cotransformation converts the difficult operation of logarithmic subtraction into the easier operation of logarithmic addition. In this paper, a new variant of cotransformation is proposed, which is simpler to design and more economical in hardware than previous cotransformation methods. The novel method commutes operands differently for addition than for subtraction. Simulation results show how many guard bits are required by the new cotransformation to guarantee faithful rounding and that, even without guard bits, cotransformation produces an LNS unit more accurate than a previously published hardware-description-language (HDL) library for LNS arithmetic that uses only multipartite tables or 2nd-order interpolation.

Digital Systems Design | 2007

Cotransformation Provides Area and Accuracy Improvement in an HDL Library for LNS Subtraction

Panagiotis D. Vouzis; Sylvain Collange; Mark G. Arnold

The reduction of the cumbersome operations of multiplication, division, and powering to addition, subtraction and multiplication is what makes the Logarithmic Number System (LNS) attractive. Addition and subtraction, though, are the bottleneck of every LNS circuit, for which there are implementation techniques that trade off area, latency and accuracy. This paper reviews the methods of interpolation, multipartite tables and cotransformation for LNS addition and subtraction, but special focus is given to a novel version of cotransformation, for which a new special case is identified. Synthesis results compare an already published Hardware Description Language (HDL) library for LNS arithmetic that uses only multipartite tables or 2nd-order interpolation against a variation of the same library combined with cotransformation. Exhaustive simulation and a graphics example illustrate that the proposed library has smaller area requirements and is more accurate than the earlier library, at the cost of an increase in the latency of the hardware.
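The reduction described in the first sentence of this abstract can be sketched in software. The example below is an assumption-laden illustration, not the HDL library from the paper: a nonzero real is stored as a sign flag plus a fixed-point base-2 logarithm, the constant `FRAC_BITS` and all function names are hypothetical, and zero is deliberately not modelled.

```python
import math

FRAC_BITS = 12  # assumed number of fractional bits in the fixed-point logarithm
SCALE = 1 << FRAC_BITS


def to_lns(x):
    """Encode a nonzero real as (is_negative, fixed-point log2|x|)."""
    if x == 0:
        raise ValueError("zero needs a special code in LNS; not modelled here")
    return (x < 0, round(math.log2(abs(x)) * SCALE))


def from_lns(v):
    """Decode an LNS pair back to a float."""
    is_negative, exponent = v
    magnitude = 2.0 ** (exponent / SCALE)
    return -magnitude if is_negative else magnitude


def lns_mul(a, b):
    # Multiplication becomes one integer addition of the logarithms.
    return (a[0] != b[0], a[1] + b[1])


def lns_div(a, b):
    # Division becomes one integer subtraction of the logarithms.
    return (a[0] != b[0], a[1] - b[1])


def lns_pow(a, k):
    # Integer powering of a positive value becomes one integer multiplication.
    if a[0]:
        raise ValueError("sketch handles positive bases only")
    return (False, a[1] * k)


product = from_lns(lns_mul(to_lns(8.0), to_lns(4.0)))
quotient = from_lns(lns_div(to_lns(8.0), to_lns(4.0)))
square = from_lns(lns_pow(to_lns(3.0), 2))
print(product, quotient, square)
```

Values whose logarithm is not exactly representable in `FRAC_BITS` bits, such as 3.0, are only approximated, which is the accuracy side of the area, latency and accuracy trade-off mentioned in the abstract.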

Signal Processing Systems | 2010

A Novel Cotransformation for LNS Subtraction

Panagiotis D. Vouzis; Sylvain Collange; Mark G. Arnold

The Logarithmic Number System (LNS) can be considered a simplification of the Floating Point (FP) Number System that assumes the mantissa is always equal to one, and has a binary fixed-point exponent. LNS converts multiplication/division to a single addition/subtraction, which makes LNS a very attractive choice for applications where these operations predominate, such as in some signal-processing algorithms. However, for wordlengths greater than 20 bits LNS becomes expensive because of the hardware-demanding LNS operations of addition and subtraction, which are typically important for most signal-processing algorithms. This paper gives an overview of the family of LNS subtraction algorithms called “Cotransformations,” and proposes a “Novel Cotransformation Combination” that offers improvements in terms of area and speed without sacrificing accuracy compared to previous methods. The hardware requirements of the proposed method are analyzed mathematically, and the results are verified by using synthesis and simulations.
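To see why addition and subtraction are the hardware-demanding operations, it may help to write them out in the standard textbook form. With X = log2(x), Y = log2(y) and x > y > 0, the sum is X + log2(1 + 2^z) and the difference is X + log2(1 - 2^z), where z = Y - X < 0. The sketch below evaluates these two functions directly in floating point; hardware would approximate them with tables or interpolation. It does not implement the cotransformation proposed in the paper, and the names `s_b`, `d_b`, `lns_add` and `lns_sub` are chosen for this example.

```python
import math


def s_b(z):
    """Addition function log2(1 + 2**z), used for z <= 0."""
    return math.log2(1.0 + 2.0 ** z)


def d_b(z):
    """Subtraction function log2(1 - 2**z), defined for z < 0.

    It tends to minus infinity as z approaches 0 from below, which is the
    singularity that makes direct table-based subtraction costly.
    """
    return math.log2(1.0 - 2.0 ** z)


def lns_add(X, Y):
    """Return log2(x + y) given X = log2(x) and Y = log2(y)."""
    hi, lo = max(X, Y), min(X, Y)
    return hi + s_b(lo - hi)


def lns_sub(X, Y):
    """Return log2(x - y) given X = log2(x) and Y = log2(y), with x > y."""
    if Y >= X:
        raise ValueError("sketch requires x > y > 0")
    return X + d_b(Y - X)


log_six, log_two = math.log2(6.0), math.log2(2.0)
total = 2.0 ** lns_add(log_six, log_two)       # approximately 6 + 2
difference = 2.0 ** lns_sub(log_six, log_two)  # approximately 6 - 2
print(total, difference, d_b(-0.001))
```

On z <= 0 the addition function stays between 0 and 1, while the subtraction function is unbounded near z = 0; cotransformation methods exist to avoid evaluating it in that region.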

Digital Systems Design | 2007

A Serial Logarithmic Number System ALU

Mark G. Arnold; Panagiotis D. Vouzis

Serial arithmetic uses less hardware than parallel arithmetic. Serial floating point (FP) is slower than parallel FP. The Logarithmic Number System (LNS) simplifies operations, but a fast serial implementation of LNS has never been proposed previously. This paper presents a fast bit-serial LNS that combines a novel serial implementation of Mitchell's method and a new error correction method that is compatible with least-significant-bit-first serial arithmetic.
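For readers unfamiliar with Mitchell's method, a minimal sketch of the basic approximation follows. Writing n = 2^k (1 + m) with 0 <= m < 1, the method replaces log2(1 + m) by m, so log2(n) is approximated by k + m. This shows only the uncorrected approximation for positive integers; the paper's serial implementation and its error correction are not reproduced, and the function name `mitchell_log2` is an assumption of this example.

```python
import math


def mitchell_log2(n):
    """Mitchell's approximation of log2(n) for a positive integer n."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    k = n.bit_length() - 1         # position of the leading one, floor(log2(n))
    m = (n - (1 << k)) / (1 << k)  # fraction after the leading one, 0 <= m < 1
    return k + m                   # log2(1 + m) is approximated by m


approx = mitchell_log2(12)
exact = math.log2(12)
print(approx, exact, exact - approx)
```

The approximation is exact at powers of two and underestimates everywhere else, with an error that stays below about 0.0861, which is why a correction step is usually paired with it.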

International Symposium on Industrial Electronics | 2006

A Custom-made Algorithm-Specific Processor for Model Predictive Control

Panagiotis D. Vouzis; Leonidas Bleris; Mark G. Arnold; Mayuresh V. Kothare

This paper presents an algorithm-specific processor for embedded model predictive control (MPC). After analyzing the computational cost of MPC via profiling, we observe that the optimizations associated with MPC are dominated by operations on real matrices. To overcome this bottleneck we propose connecting a limited-resource host processor with an algorithm-specific matrix processor, whose architecture is described. The matrix processor uses a 16-bit logarithmic number system (LNS) arithmetic unit to carry out the required arithmetic operations. The proposed architecture is implemented using a hardware description language (HDL) and is subsequently synthesized and emulated on a field-programmable gate array (FPGA). The timing and area cost results are presented and analyzed.
