Vassilios A. Chouliaras
Loughborough University
Publication
Featured research published by Vassilios A. Chouliaras.
IEEE Transactions on Computers | 2007
Emmanuel Touloupis; James A. Flint; Vassilios A. Chouliaras; David D. Ward
This paper presents a detailed analysis of the behavior of a novel fault-tolerant 32-bit embedded CPU as compared to a default (non-fault-tolerant) implementation of the same processor during a fault injection campaign of single and double faults. The fault-tolerant processor tested is characterized by per-cycle voting of microarchitectural and the flop-based architectural states, redundancy at the pipeline level, and a distributed voting scheme. Its fault-tolerant behavior is characterized for three different workloads from the automotive application domain. The study proposes statistical methods for both the single and dual fault injection campaigns and demonstrates the fault-tolerant capability of both processors in terms of fault latencies, the probability of fault manifestation, and the behavior of latent faults.
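As a rough illustration of the per-cycle voting idea in the abstract above, the sketch below performs a bitwise majority vote over three copies of a state word. Triple replication and the name `majority_vote` are assumptions made for illustration; the abstract does not state the degree of redundancy or how the distributed voters are arranged.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority over three redundant copies of a state word."""
    return (a & b) | (a & c) | (b & c)


# A single-bit upset in one copy is outvoted by the other two copies.
golden = 0b1010_0110
faulty = golden ^ 0b0000_1000  # hypothetical single-bit fault in one copy
voted = majority_vote(golden, faulty, golden)
```

A 2-of-3 voter of this kind masks any single fault, but two faults that flip the same bit in two copies would be voted through, which is one reason a dual fault injection campaign is informative.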
IEEE Transactions on Consumer Electronics | 2006
Tr Jacobs; Vassilios A. Chouliaras; David Mulvaney
This study utilizes thread-level parallel techniques to significantly reduce the dynamic instruction count performance metric of the MPEG-2, MPEG-4 and H.264 video encoders. Such solutions are particularly applicable in portable devices, as distributing the workload among a number of parallel-executing processors decreases the individual processing requirements and allows real-time video encoding. Due to the use of multiple processing engines in a consumer SoC, the required clock frequency for real-time encoding, and hence power consumption, is likely to be considerably less than that of a single high-speed processor solution. The results presented demonstrate that reductions in dynamic instruction count in the range of 84% to 96% can be achieved for each of the encoders investigated.
biomedical engineering and informatics | 2008
Sijung Hu; Jia Zheng; Vassilios A. Chouliaras; Ron Summers
Contact and spot measurement have limited the application of photoplethysmography (PPG); thus, an imaging PPG system comprising a digital CMOS camera and light-emitting diodes (LEDs) of three wavelengths is developed to detect blood perfusion in tissue. With the imaging PPG system, ideally contactless monitoring with a larger field of view, and at different tissue depths through multi-wavelength illumination, can be achieved to understand changes in blood perfusion. For each individual LED wavelength, the PPG signals can be derived in both transmission and reflection modes. The outcome reveals that imaging PPG is able to detect blood perfusion in an illuminated tissue and indicates the vascular distribution and the blood cell response to each individual LED wavelength. The functionality investigation leads to an engineering model for 3-D visualized blood perfusion of tissue and to the development of imaging PPG tomography.
international conference on consumer electronics | 2006
Y.L. Nunez-Yanez; Vassilios A. Chouliaras; D. Alfonso; F.S. Rovati
This paper investigates the algorithmic complexity of rate distortion optimization and arithmetic coding in the new H.264 video coding standard and proposes a hardware accelerator to reduce it by more than an order of magnitude. The accelerator incorporates arithmetic coding and decoding engines and efficiently handles all the context information required by RDO and CABAC in H.264. The bit stream generated by the hardware is equivalent to that generated by the JM 9.4 reference implementation. The ISA of a controlling scalar 32-bit RISC CPU has been extended with custom RDO/CABAC instructions and the accelerator prototyped in a state-of-the-art FPGA technology.
international conference on consumer electronics | 2005
Vassilios A. Chouliaras; J. L. Nunez; David Mulvaney; F.S. Rovati; D. Alfonso
A multi-standard video encoding coprocessor is presented that efficiently accelerates MPEG-2, MPEG-4 (XViD) and a proprietary H.264 encoder. The proposed architecture attaches to a configurable, extensible RISC CPU to form a highly efficient solution to the computational complexity of current and emerging video coding standards. A subset of the ISA has been implemented as a VLSI macrocell for a high performance 0.13 µm silicon process.
field-programmable logic and applications | 2008
Jose Luis Nunez-Yanez; Eddie Hung; Vassilios A. Chouliaras
This work presents a programmable, configurable motion estimation processor for the H.264 video coding standard, capable of handling the processing requirements of high definition (HD) video and suitable for FPGA implementation. The programmable aspect of the processor follows the ASIP (application-specific instruction set processor) approach with an instruction set targeted to accelerating block matching motion estimation algorithms. Configurability relates to the ability to optimize the microarchitecture for the selected algorithm and performance requirements through varying the number and type of execution units at compile time.
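Block-matching motion estimation, the class of algorithm this processor targets, can be sketched in software as an exhaustive search that minimizes the sum of absolute differences (SAD). The full-search strategy, the SAD cost, and every function and parameter name below are illustrative assumptions; they are not the processor's instruction set or the authors' algorithm.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between an n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by the candidate vector (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def full_search(cur, ref, bx, by, n, r):
    """Try every displacement within +/- r pixels and return (cost, dx, dy)
    for the lowest-cost candidate that lies fully inside the reference frame."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            inside = (0 <= by + dy and by + dy + n <= height
                      and 0 <= bx + dx and bx + dx + n <= width)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

The inner SAD loop is the part that dominates the cost of such a search, and it is independent across candidate vectors, which is what makes it a natural target for multiple parallel execution units.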
field-programmable logic and applications | 2007
Jose Luis Nunez-Yanez; Vassilios A. Chouliaras; Jiri Gaisler
This paper presents a DVS (Dynamic Voltage Scaling) enabled SoC (System-on-Chip) processing platform based on the Leon3 open-source processor and dynamically reconfigurable clock synthesis technology available in Virtex-4 Xilinx FPGAs. A special DVS monitor unit maintains correct operation of the processor core at a given voltage by tracking the behavior of an internal delay line and stopping the processor clock through a digital clock management (DCM) macroblock when a timing error is about to occur. Upon detection of a new valid working point the DVS monitor unit reconfigures the main DCM to synthesize a new frequency-adjusted CPU clock signal and reactivates the processor. The energy savings and operation range of the technology are evaluated in the context of video coding applications by executing different motion estimation kernels.
IEEE Transactions on Very Large Scale Integration Systems | 2008
Xiaofeng Wu; Vassilios A. Chouliaras; Jose Luis Nunez-Yanez; Roger M. Goodall
This paper describes a novel control system processor architecture based on ΔΣ modulation known as the ΔΣ-CSP. The ΔΣ-CSP utilizes 1-bit processing which is a new concept in digital control applications with the direct benefit of making multi-bit multiplication operations redundant. A simple conditional-negate-and-add (CNA) unit is instead used for operations in control law implementations. For this reason, the proposed processor has a very small silicon footprint and runs at very high frequencies making it ideal for high-sampling rate, real-time control applications. A number of ΔΣ-CSP configurations have been implemented as VLSI hard macros in a high-performance 0.13-µm CMOS process and a particular configuration achieved a post-route operating frequency of 355 MHz resulting in a 2.17 MHz sampling rate for a fourth-order control law implementation. Additional results prove that the ΔΣ-CSP compares very favorably, in terms of silicon area and sampling rates, to two other specialized digital control processing systems, including direct, hardwired implementation of control laws; at the same time, it substantially outperforms software implementations of control laws running on very wide, general-purpose VLIW architectures.
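The reason 1-bit processing removes the multiplier can be shown with a small sketch: when a signal sample is a single bit standing for +1 or -1, multiplying it by a multi-bit coefficient reduces to adding or subtracting that coefficient. The bit encoding (1 for +1, 0 for -1) and the names `cna` and `dot_1bit` are assumptions made for illustration; the abstract does not specify the encoding or datapath of the actual CNA unit.

```python
def cna(acc: int, coeff: int, bit: int) -> int:
    """Conditional-negate-and-add: add coeff when bit is 1 (+1),
    add its negation when bit is 0 (-1). No multiplier is involved."""
    return acc + coeff if bit else acc - coeff


def dot_1bit(coeffs, bits):
    """Accumulate sum(coeff * s) for s in {+1, -1} using only CNA steps."""
    acc = 0
    for coeff, bit in zip(coeffs, bits):
        acc = cna(acc, coeff, bit)
    return acc


# Same result as an explicit multiply-accumulate over the +/-1 samples.
coeffs = [3, 5, -2]
bits = [1, 0, 1]
result = dot_1bit(coeffs, bits)
reference = sum(c * (1 if b else -1) for c, b in zip(coeffs, bits))
```

Each term of a control law therefore costs one negate-or-pass and one addition, which is consistent with the small area and high clock rate reported in the abstract.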
Neurocomputing | 2007
Zhenhuan Zhu; David Mulvaney; Vassilios A. Chouliaras
This paper presents a novel genetic algorithm, termed the Optimum Individual Monogenetic Algorithm (OIMGA) and describes its hardware implementation. As the monogenetic strategy retains only the optimum individual, the memory requirement is dramatically reduced and no crossover circuitry is needed, thereby ensuring the requisite silicon area is kept to a minimum. Consequently, depending on application requirements, OIMGA allows the investigation of solutions that warrant either larger GA populations or individuals of greater length. The results given in this paper demonstrate that both the performance of OIMGA and its convergence time are superior to those of existing hardware GA implementations. Local convergence is achieved in OIMGA by retaining elite individuals, while population diversity is ensured by continually searching for the best individuals in fresh regions of the search space. Keywords—Genetic algorithms, hardware-based machine learning.
application-specific systems, architectures, and processors | 2005
Jose Luis Nunez-Yanez; Vassilios A. Chouliaras
A high performance and silicon efficient hardware architecture for binary arithmetic coding (BAC) acceleration is presented and its application to entropy coding in the context of the H.264 video compressor standard described. The proposed hardware architecture remains bit compatible with the software implementation used in the H.264 ITU standard. The renormalization sequence that maintains the state variables in the appropriate range has been rewritten in order to enable a data independent throughput in hardware of 1 symbol per clock cycle. The instruction set extensions required to be implemented as part of the ISA of a controlling RISC are proposed. Finally, ASIC and FPGA implementations are obtained and the performance and complexity compared with recent implementations of the well-known MQ-coder reported.