Theocharis Theocharides

Publication

Featured research published by Theocharis Theocharides.

IEEE Transactions on Very Large Scale Integration (VLSI) Systems | 2011

A Flexible Parallel Hardware Architecture for AdaBoost-Based Real-Time Object Detection

Christos Kyrkou; Theocharis Theocharides

Real-time object detection is becoming necessary for a wide range of applications in computer vision and image processing, security, bioinformatics, and several other areas. Existing software implementations of object detection algorithms are constrained to small images and rely on favorable conditions in the image frame to achieve real-time detection frame rates. Efforts to design hardware architectures have yielded encouraging results, yet are mostly directed towards a single application, targeting specific operating environments. Consequently, there is a need for hardware architectures capable of detecting several objects in large image frames, and which can be used under several object detection scenarios. In this work, we present a generic, flexible parallel architecture, which is suitable for all ranges of object detection applications and image sizes. The architecture implements the AdaBoost-based detection algorithm, which is considered one of the most efficient object detection algorithms. Through both field-programmable gate array emulation and large-scale implementation, and register transfer level synthesis and simulation, we illustrate that the architecture can detect objects in large images (up to 1024 × 768 pixels) with frame rates that vary between 64 and 139 fps for various applications and input image frame sizes.

Design, Automation, and Test in Europe | 2010

Towards hardware stereoscopic 3D reconstruction: a real-time FPGA computation of the disparity map

Stavros Hadjitheophanous; Christos Ttofis; Athinodoros S. Georghiades; Theocharis Theocharides

Stereoscopic 3D reconstruction is an important algorithm in the field of Computer Vision, with a variety of applications in embedded and real-time systems. Existing software-based implementations cannot satisfy the performance requirements for such constrained systems; hence an embedded hardware mechanism might be more suitable. In this paper, we present an architecture of a 3D reconstruction system for stereoscopic images, which we implement on a Virtex2 Pro FPGA. The architecture uses a Sobel edge detector to achieve real-time (75 fps) performance, and is configurable in terms of various application parameters, making it suitable for a number of application environments. The paper also presents a design exploration on algorithmic parameters such as disparity range, correlation window size, and input image size, illustrating the impact of each parameter on performance.

IEEE Transactions on Computers | 2012

A Parallel Hardware Architecture for Real-Time Object Detection with Support Vector Machines

Christos Kyrkou; Theocharis Theocharides

Object detection applications are often associated with real-time performance constraints that stem from the embedded environments in which they are deployed. Consequently, researchers have proposed dedicated hardware architectures, utilizing a variety of classification algorithms targeting object detection. Support Vector Machines (SVMs) are among the most popular classification algorithms used in object detection, yielding high accuracy rates. However, existing SVM hardware implementations attempting to speed up SVM classification have targeted either only simple applications or SVM training. As such, few proposed hardware architectures are generic enough to be used in a variety of object detection applications. Hence, this paper presents a parallel array architecture for SVM-based object detection, in an attempt to show the advantages and performance benefits that stem from a dedicated hardware solution. The proposed hardware architecture provides parallel processing, resource sharing among the processing units, and efficient memory management. Furthermore, the size of the array is scalable to the hardware demands, and the architecture can also handle a variety of applications such as multiclass classification problems. A prototype of the proposed architecture was implemented on an FPGA platform and evaluated using three popular detection applications, demonstrating real-time performance (40-122 fps for a variety of applications).

IEEE Transactions on Computers | 2013

Edge-Directed Hardware Architecture for Real-Time Disparity Map Computation

Christos Ttofis; Stavros Hadjitheophanous; Athinodoros S. Georghiades; Theocharis Theocharides

Stereo Vision, a technique aimed at inferring depth information from stereo images, has been used in a wide range of computer vision applications, with real-time requirements in emerging embedded vision systems. Computation of the disparity map, a vital step in extracting depth information from stereo images, requires a significant amount of computational resources. As such, existing software implementations require high-end hardware platforms to achieve real-time frame rates, suggesting that dedicated hardware mechanisms might be more suitable for embedded applications. In this paper, we present a disparity map computation architecture targeting embedded stereo vision applications with hard real-time requirements. The architecture integrates a hardware edge detection mechanism that reduces the search space, improving the overall performance, and is configurable in terms of various application parameters, making it suitable for a number of application environments. The paper also presents a study on the impact of the various parameters in terms of the performance and hardware/power overheads. An experimental prototype of the architecture was implemented on the Xilinx ML505 FPGA Evaluation Platform, achieving 50 frames per second (fps) for 1,280 × 1,024 images. Moreover, the quality of the disparity maps generated by the proposed system is comparable to that of other existing hardware implementations featuring local stereo correspondence methods.
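To make the idea of edge-directed disparity computation concrete, the following is a minimal software sketch, not the authors' hardware architecture. It assumes a simple interpretation in which the disparity search runs only at pixels with a strong horizontal Sobel response, and it uses a sum-of-absolute-differences (SAD) window cost. The function names, the `edge_thresh` parameter, and the choice of SAD are illustrative assumptions that do not come from the abstract.

```python
def sobel_x(img, y, x):
    """Horizontal Sobel response at an interior pixel (y, x)."""
    return (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
            - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])


def sad(left, right, y, x, d, r):
    """SAD between the left window centred at (y, x) and the right
    window centred at (y, x - d); r is the window radius."""
    total = 0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            total += abs(left[y + dy][x + dx] - right[y + dy][x + dx - d])
    return total


def edge_directed_disparity(left, right, max_disp, r=1, edge_thresh=100):
    """Compute disparities only at edge pixels of the left image.

    Assumption: non-edge pixels are skipped entirely (left as None),
    which is one simple way an edge detector can shrink the search space.
    Requires r >= 1 so the Sobel neighbourhood stays inside the image.
    """
    h, w = len(left), len(left[0])
    disp = [[None] * w for _ in range(h)]
    for y in range(r, h - r):
        # Start at r + max_disp so every shifted window stays in bounds.
        for x in range(r + max_disp, w - r):
            if abs(sobel_x(left, y, x)) < edge_thresh:
                continue  # flat region: no disparity search here
            costs = [sad(left, right, y, x, d, r) for d in range(max_disp + 1)]
            disp[y][x] = costs.index(min(costs))  # winner-takes-all
    return disp


# Synthetic example: a vertical step edge shifted by 2 pixels between views.
left_img = [[0] * 7 + [200] * 5 for _ in range(5)]
right_img = [[0] * 5 + [200] * 7 for _ in range(5)]
result = edge_directed_disparity(left_img, right_img, max_disp=3)
```

In this toy input only the two columns around the step edge are searched, so the matching cost is evaluated at a small fraction of the pixels. A real system would still need some strategy for filling in disparities at the skipped pixels, which this sketch does not attempt.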

Design, Automation, and Test in Europe | 2012

Towards accurate hardware stereo correspondence: a real-time FPGA implementation of a segmentation-based adaptive support weight algorithm

Christos Ttofis; Theocharis Theocharides

Disparity estimation in stereoscopic vision is a vital step for the extraction of depth information from stereo images. This paper presents the hardware implementation of a disparity estimation system that enables good performance in both accuracy and speed. The architecture implements an adaptive support weight stereo correspondence algorithm, which integrates information obtained from image segmentation, in an attempt to increase the robustness of the matching process. The proposed system integrates optimization techniques that make the algorithm hardware-friendly and suitable for embedded vision systems. A prototype of the architecture was implemented on an FPGA, achieving 30 fps for 640 × 480 images. The quality of the disparity maps generated by the proposed system is also better than that of other existing hardware implementations featuring fixed support local correspondence methods.

IEEE Embedded Systems Letters | 2009

SCoPE: Towards a Systolic Array for SVM Object Detection

Christos Kyrkou; Theocharis Theocharides

This paper presents SCoPE (systolic chain of processing elements), a first step towards the realization of a generic systolic array for support vector machine (SVM) object classification in embedded image and video applications. SCoPE provides efficient memory management, reduced complexity, and efficient data transfer mechanisms. The proposed architecture is generic and scalable, as the size of the chain and the kernel module can be changed in a plug-and-play fashion without affecting the overall system architecture. These advantages provide versatility, scalability, and reduced complexity that make it ideal for embedded applications. Furthermore, the SCoPE architecture is intended to be used as a building block towards larger systolic systems for multi-input or multi-class classification. Simulation results indicate real-time performance, achieving face detection at ~33 frames per second on an FPGA prototype.

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2012

Intelligent Hotspot Prediction for Network-on-Chip-Based Multicore Systems

Elena Kakoulli; Vassos Soteriou; Theocharis Theocharides

Hotspots are network-on-chip (NoC) routers or modules in multicore systems which occasionally receive packetized data from other networked element producers at a rate higher than they can consume it. This adverse phenomenon may greatly reduce the performance of NoCs, especially when wormhole flow-control is employed, as backpressure can cause the buffers of neighboring routers to quickly fill up, leading to a spatial spread in congestion. This can cause the network to saturate prematurely, and in the worst case the NoC may be rendered unrecoverable. Thus, a hotspot prevention mechanism can be greatly beneficial, as it can potentially enable the interconnection system to adjust its behavior and prevent the rise of potential hotspots, subsequently sustaining NoC performance. The inherent unevenness of traffic patterns in an NoC-based general-purpose multicore system such as a chip multiprocessor, due to the diverse and unpredictable access patterns of applications, produces unexpected hotspots whose appearance cannot be known a priori, as application demands are not predetermined, making hotspot prediction and subsequently prevention difficult. In this paper, we present an artificial neural network (ANN)-based hotspot prediction mechanism that can be potentially used in tandem with a hotspot avoidance or congestion-control mechanism to handle unforeseen hotspot formations efficiently. The ANN uses online statistical data to dynamically monitor the interconnect fabric, and reactively predicts the location of about-to-be-formed hotspot(s), allowing enough time for the multicore system to react to these potential hotspots. Evaluation results indicate that a relatively lightweight ANN-based predictor can forecast hotspot formation(s) with an accuracy ranging from 65% to 92%.

Symposium on Cloud Computing | 2004

ChipPower: an architecture-level leakage simulator

Yuh-Fang Tsai; Ananth Hegde Ankadi; Narayanan Vijaykrishnan; Mary Jane Irwin; Theocharis Theocharides

Leakage power is projected to be one of the major challenges in future technology generations. The temperature profile, process variation, and transistor count all have a strong impact on the leakage power distribution of a processor. We have built a simulator to estimate the dynamic/leakage power for a VLIW architecture considering dynamic temperature feedback and process variation. The framework is based on an architecture similar to the Intel Itanium IA64 and is extended to simulate its power when implemented in 65nm technology. Our experimental results show that leakage power will become more than 50% of the power budget in 65nm technology. Moreover, without including the process variation, the total leakage power will be underestimated by as much as 30%.

IEEE Transactions on Neural Networks | 2016

Embedded Hardware-Efficient Real-Time Classification With Cascade Support Vector Machines

Christos Kyrkou; Christos-Savvas Bouganis; Theocharis Theocharides; Marios M. Polycarpou

Cascade support vector machines (SVMs) are optimized to efficiently handle problems where the majority of the data belong to one of the two classes, such as image object classification, and hence can provide speedups over monolithic (single) SVM classifiers. However, SVM classification is a computationally demanding task, and existing hardware architectures for SVMs only consider monolithic classifiers. This paper proposes the acceleration of cascade SVMs through a hybrid processing hardware architecture optimized for the cascade SVM classification flow, accompanied by a method to reduce the required hardware resources for its implementation, and a method to improve the classification speed utilizing cascade information to further discard data samples. The proposed SVM cascade architecture is implemented on a Spartan-6 field-programmable gate array (FPGA) platform and evaluated for object detection on 800 × 600 (Super Video Graphics Array) resolution images. The proposed architecture, boosted by a neural network that processes cascade information, achieves a real-time processing rate of 40 frames/s for the benchmark face detection application. Furthermore, the hardware-reduction method results in the utilization of 25% less FPGA custom-logic resources and 20% peak power reduction compared with a baseline implementation.
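The speedup of a cascade over a monolithic classifier comes from early rejection: most samples are discarded by the first, cheapest stages. The sketch below illustrates that control flow in software only. It assumes linear SVM stages with a fixed decision threshold of zero, and the names `linear_svm_score`, `cascade_classify`, and the example weights are hypothetical; the paper's hybrid hardware architecture and its neural-network-based sample discarding are not modelled here.

```python
def linear_svm_score(weights, bias, x):
    """Decision value of a linear SVM: w . x + b."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def cascade_classify(stages, x):
    """Run x through the stages in order; reject at the first negative score.

    stages: list of (weights, bias) pairs, cheapest stage first (assumed).
    Returns (accepted, stages_evaluated) so the saved work is visible.
    """
    for i, (weights, bias) in enumerate(stages, start=1):
        if linear_svm_score(weights, bias, x) < 0:
            return False, i  # early rejection: later stages never run
    return True, len(stages)


# Hypothetical three-stage cascade over 2-D feature vectors.
example_stages = [
    ([1, 0], -1),
    ([0, 1], -1),
    ([1, 1], -3),
]

rejected_early = cascade_classify(example_stages, [0, 5])
rejected_mid = cascade_classify(example_stages, [2, 0])
accepted = cascade_classify(example_stages, [2, 2])
```

Here the first sample is rejected after one stage and the second after two, while only the third sample pays for all three stages. When negatives dominate the data, as in sliding-window object detection, the average cost per sample stays close to the cost of the first stage.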

IEEE Transactions on Computers | 2016

A Low-Cost Real-Time Embedded Stereo Vision System for Accurate Disparity Estimation Based on Guided Image Filtering

Christos Ttofis; Christos Kyrkou; Theocharis Theocharides

Stereo matching, a key element towards extracting depth information from stereo images, is widely used in several embedded consumer electronic and multimedia systems. Such systems demand high processing performance and accurate depth perception, while their deployment in embedded and mobile environments implies that cost, energy, and memory overheads need to be minimized. Hardware acceleration has been demonstrated in efficient embedded stereo vision systems. To this end, this paper presents the design and implementation of a hardware-based stereo matching system able to provide high accuracy and concurrently high performance for embedded vision devices, which are associated with limited hardware and power budgets. We first implemented a compact and efficient design of the guided image filter, an edge-preserving filter, which reduces the hardware complexity of the implemented stereo algorithm while maintaining high-quality results. The guided filter design is used in two parts of the stereo matching pipeline, showing that it can simplify the hardware complexity of the Adaptive Support Weight aggregation step, and efficiently enable a powerful disparity refinement unit, which improves matching accuracy, even though cost aggregation is based on simple, fixed support strategies. We implemented several variants of our design on a Kintex-7 FPGA board, which was able to process HD video (1,280 × 720) in real-time (60 fps), using ~57.5k and ~71k of the FPGA's logic (CLB) and register resources, respectively. Additionally, the proposed stereo matching design delivers leading accuracy when compared to state-of-the-art hardware implementations based on the Middlebury evaluation metrics (at least 1.5 percent fewer bad matching pixels).
