Minas Dasygenis
University of Western Macedonia
Publication
Featured research published by Minas Dasygenis.
power and timing modeling optimization and simulation | 2000
Dimitrios Soudris; Nikolaos D. Zervas; Antonios Argyriou; Minas Dasygenis; Konstantinos Tatas; Constantinos E. Goutis; Adonios Thanailakis
Exploiting data reuse in combination with a custom memory hierarchy that exploits the temporal locality of data accesses may introduce significant power savings, especially for data-intensive applications. The effect of the data-reuse decisions on the power dissipation, as well as on the area and performance, of multimedia applications realized on multiple embedded cores is explored. The interaction between the data-reuse decisions and the selection of a certain data-memory architecture model is also studied. As a demonstrator, a widely used video processing algorithmic kernel, namely the full search motion estimation kernel, is used. Experimental results prove that improvements in both power and performance can be achieved when the right combination of data memory architecture model and data-reuse transformation is selected.
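As a rough illustration of the kernel named in this abstract, the following is a minimal Python sketch of full search block matching using the sum of absolute differences (SAD). The function names, the frame layout (lists of rows) and the parameters `n` (block size) and `p` (search range) are assumptions made for this example and are not taken from the paper. Neighbouring candidate blocks overlap heavily, so the same reference pixels are read many times; that repeated access is the kind of temporal locality a data-reuse transformation can exploit.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, p):
    """Evaluate every displacement in [-p, p] x [-p, p] that keeps the
    candidate block inside `ref`; return (best_dx, best_dy, best_sad)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            # Skip candidates that would fall outside the reference frame.
            if not (0 <= by + dy and by + dy + n <= height):
                continue
            if not (0 <= bx + dx and bx + dx + n <= width):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

With a search range of `p`, each block costs up to `(2p + 1)^2` SAD evaluations, which is why the kernel is a common benchmark for memory-hierarchy studies.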
IEEE Transactions on Very Large Scale Integration Systems | 2006
Minas Dasygenis; Erik Brockmeyer; Bart Durinck; Francky Catthoor; Dimitrios Soudris; A. Thanailakis
Memory latency has always been a major issue in embedded systems that execute memory-intensive applications. This is even more true as the gap between processor and memory speed continues to grow. Hardware and software prefetching have been shown to be effective in tolerating the large memory latencies inherent in large off-chip memories; however, both types of prefetching have their shortcomings. Hardware schemes are more complex and require extra circuitry to compute data access strides, while software schemes generate prefetch instructions, which, if not computed carefully, may hamper performance. On the other hand, some application domains (such as multimedia) have a uniform memory access pattern that is known a priori and that, if exploited, could yield significant application performance improvement. With this characteristic in mind, we present our findings on hiding memory latency using the direct memory access (DMA) mode, which is present in all modern systems, combined with a software prefetch mechanism and a customized on-chip memory hierarchy mapping. Compared to previous approaches, we are able to estimate the performance and power metrics without actually implementing the embedded system. Experimental results on nine well-known multimedia and imaging applications prove the efficiency of our technique. Finally, we verify the performance estimations by implementing and simulating the algorithms on the TI C6201 processor.
international conference on design and technology of integrated systems in nanoscale era | 2014
Minas Dasygenis
Design space exploration of new circuit methodologies requires the creation of models in hardware description languages to evaluate the characteristics for different parameters, a time-consuming process. To alleviate the burden of HDL construction, we present a compact netlist format and a web tool that creates syntactically correct VHDL files. The designer can use our tool, together with an easy-to-create netlist generator, to quickly create multiple VHDL files and skeleton test benches to evaluate the model. Our parametrized netlist generators illustrate the efficiency of our EDA tool.
design, automation, and test in europe | 2005
Minas Dasygenis; Erik Brockmeyer; Bart Durinck; Francky Catthoor; Dimitrios Soudris; A. Thanailakis
The memory subsystem has always been a bottleneck in performance as well as a significant power contributor in memory-intensive applications. Many researchers have presented multi-layered memory hierarchies as a means to design energy- and performance-efficient systems. However, most of the previous work does not explore trade-offs systematically. We fill this gap by proposing a formalized technique that takes into consideration data reuse, the limited lifetime of the arrays of an application, and application-specific prefetching opportunities, and performs a thorough trade-off exploration for different memory layer sizes. This technique has been implemented in a prototype tool, which was tested successfully using nine real-life applications of industrial relevance. Following this approach, we have been able to reduce execution time by up to 60% and energy consumption by up to 70%.
IEEE Transactions on Circuits and Systems | 2008
Minas Dasygenis; K. Mitroglou; Dimitrios Soudris; Adonios Thanailakis
Over the last three decades, there has been considerable interest in the implementation of digital computer elements using hardware based on the residue number system (RNS), due to the carry-free addition and other beneficial characteristics of this system. Scaling is one of the essential operations in this number system and is required for almost every digital signal processing application. Up to now, researchers have suggested costly and low-throughput read-only memory-based approaches to address this need. We also address this need by presenting a novel graph-based methodology for designing high-throughput and low-cost VLSI RNS scaling architectures, based completely on full adders (FAs). Our formalized methodology consists of a number of steps, which specify the minimum number of FAs for performing the scaling operation as well as the interconnections among the FAs. We present our formalized methodology together with a running example to aid comprehension. Negative residue numbers are covered as well, requiring no additional effort. Finally, we have developed a design support tool that can provide structural VHDL descriptions of our RNS scalers, which can be synthesized in VLSI tools.
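To make the two RNS operations mentioned in this abstract concrete, here is a small arithmetic sketch in Python. It is not the full-adder-based architecture of the paper. It only shows, under the assumption of a pairwise coprime moduli set, why addition needs no carries between channels and what scaling by one modulus computes. The helper names (`rns_add`, `crt`, `rns_scale`) and the moduli set (7, 8, 9) are chosen for illustration only, and the base extension step uses a plain Chinese remainder reconstruction as a stand-in for whatever a hardware scaler would do.

```python
from math import prod


def rns_add(a, b, moduli):
    """Add two RNS numbers channel by channel; no carry crosses channels."""
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))


def crt(residues, moduli):
    """Chinese remainder reconstruction for pairwise coprime moduli."""
    total = prod(moduli)
    value = 0
    for r, m in zip(residues, moduli):
        partial = total // m
        value += r * partial * pow(partial, -1, m)  # modular inverse (Python 3.8+)
    return value % total


def rns_scale(residues, moduli, j):
    """Return the RNS form of floor(X / moduli[j]) for a non-negative X."""
    mj, xj = moduli[j], residues[j]
    scaled = {}
    for i, (r, m) in enumerate(zip(residues, moduli)):
        if i != j:
            # (X - x_j) is an exact multiple of m_j, so dividing by m_j
            # becomes multiplication by its inverse modulo m_i.
            scaled[i] = ((r - xj) * pow(mj, -1, m)) % m
    # Base extension: the quotient is smaller than the product of the other
    # moduli, so they determine it uniquely; recover the missing channel j.
    other_moduli = [m for i, m in enumerate(moduli) if i != j]
    quotient = crt([scaled[i] for i in sorted(scaled)], other_moduli)
    scaled[j] = quotient % mj
    return tuple(scaled[i] for i in range(len(moduli)))
```

For example, with moduli (7, 8, 9) the value 100 has residues (2, 4, 1), and scaling by the modulus 8 gives the residues of 12, which are (5, 4, 3).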
international conference on tools with artificial intelligence | 2014
Minas Dasygenis; Kostas Stergiou
Portfolio-based approaches to constraint solving aim at exploiting the variability in performance displayed by different solvers or different parameter settings of a single solver. Such approaches have been quite successful in both sequential and parallel processing modes. Given the increasingly large number of available processors for parallel processing, an important challenge when designing portfolios is to identify solver parameters that offer diversity in the exploration of the search space and to generate different solver configurations by automatically tuning these parameters. In this paper we propose, for the first time, a way to build portfolios for parallel solving by parameterizing the local consistency property applied during search. To achieve this we exploit heuristics for adaptive propagation proposed in stergiou08. We show how this approach can result in the easy automatic generation of portfolios that display large performance variability. We make an experimental comparison against a standard sequential solver as well as portfolio-based methods that use randomization of the variable ordering heuristic as the source of diversity. Results demonstrate that our method constantly outperforms the sequential solver and in most cases is more efficient than the other portfolio approaches.
international symposium on circuits and systems | 2005
Nikolaos Kroupis; Minas Dasygenis; Kleoniki Markou; Dimitrios Soudris; A. Thanailakis
Multimedia devices are one of the growing areas in the embedded community. Such devices incorporate a number of complicated functions for their operation, such as motion estimation. A multitude of different implementations have been proposed to reduce motion estimation complexity, such as spiral search. We have studied the implementations of spiral search and identified areas for improvement. We propose a modified spiral search motion estimation algorithm with lower computational complexity than the original spiral search. We have implemented our algorithm on an embedded ARM-based architecture with a custom memory hierarchy. The resulting system yields lower energy consumption and higher performance, with some penalty in image quality, compared with the original spiral search algorithm.
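The abstract does not describe the modification itself, so the following Python sketch only illustrates the general idea of a spiral candidate order: displacements are visited from the centre outward, so that a good match near zero motion is found early. The early-exit `threshold` and the `cost` callback are assumptions added for this example (early termination is a common companion to spiral ordering, but it is not claimed to be the authors' method).

```python
def spiral_offsets(p):
    """Yield every displacement (dx, dy) with |dx|, |dy| <= p, starting at
    the centre and walking outward in a square spiral."""
    x = y = 0
    dx, dy = 1, 0                      # initial heading
    yield (0, 0)
    count, total = 1, (2 * p + 1) ** 2
    steps = 1
    while count < total:
        for _ in range(2):             # two legs share each step length
            for _ in range(steps):
                x, y = x + dx, y + dy
                if abs(x) <= p and abs(y) <= p:
                    yield (x, y)
                    count += 1
            dx, dy = -dy, dx           # quarter turn
        steps += 1


def spiral_search(cost, p, threshold):
    """Evaluate cost(dx, dy) in spiral order and stop as soon as a candidate
    is at or below `threshold`; return (dx, dy, cost) of the best seen."""
    best = None
    for dx, dy in spiral_offsets(p):
        c = cost(dx, dy)
        if best is None or c < best[2]:
            best = (dx, dy, c)
        if c <= threshold:
            break
    return best
```

In a motion estimator, `cost` would typically be a block-matching metric such as the sum of absolute differences for the candidate displacement.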
power and timing modeling optimization and simulation | 2014
Giannis Petrousov; Minas Dasygenis
To meet short time-to-market windows, modern digital circuits demand rapid prototype design exploration, which can be achieved using space exploration tools that, given a set of input constraints, generate the HDL definitions that implement the given functionality. The residue number system (RNS), a non-conventional arithmetic system, has been proposed as a viable alternative for hardware acceleration, due to its carry-free nature. Here, we present the first web-accessible EDA tool that can generate custom synthesizable forward binary-to-RNS converters for a specific input bit width and a moduli base, which can be optimally selected by our tool. Our novel tool is the first one to automate the design of forward residue number system converters and simultaneously provide custom test benches to verify their correctness. The tool is available to the public from our web server. Our synthesized circuits on a Xilinx Virtex 6 FPGA XC6VLX760 operate at up to 783 MHz.
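As a behavioural reference for what a forward converter computes, here is a minimal Python sketch. It is not the generated HDL and does not reproduce the tool's moduli selection. It only assumes the standard RNS requirements that the moduli be pairwise coprime and that their product cover every value of the given input bit width. The function names and the example moduli set (7, 8, 9) are illustrative assumptions.

```python
from itertools import combinations
from math import gcd, prod


def check_moduli(moduli, bit_width):
    """Raise ValueError unless the moduli are pairwise coprime and their
    product (the dynamic range) covers all bit_width-bit unsigned inputs."""
    if any(gcd(a, b) != 1 for a, b in combinations(moduli, 2)):
        raise ValueError("moduli must be pairwise coprime")
    if prod(moduli) < 2 ** bit_width:
        raise ValueError("dynamic range too small for the input bit width")


def forward_convert(x, moduli, bit_width):
    """Binary-to-RNS conversion: one residue per modulus."""
    check_moduli(moduli, bit_width)
    if not 0 <= x < 2 ** bit_width:
        raise ValueError("input does not fit in the given bit width")
    return tuple(x % m for m in moduli)
```

For an 8-bit input and moduli (7, 8, 9), `forward_convert(100, (7, 8, 9), 8)` yields the residues (2, 4, 1). A model like this could serve as the golden reference when checking a test bench, though how the tool builds its own test benches is not stated in the abstract.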
Real-time Imaging | 2003
Konstantinos Tatas; Minas Dasygenis; N. Kroupis; Antonios Argyriou; Dimitrios Soudris; A. Thanailakis
A memory power optimization and performance exploration methodology is introduced, based on high-level (C language) code transformations, that allows the system designer to explore various data memory power, data memory area and performance trade-offs early in the design process of embedded multimedia systems. This exploration strategy is introduced for both single-processor and multiprocessor environments. The latter requires partitioning of the application. After employing software transformations, the experimental results, obtained using four well-known motion estimation kernels, provide insight into the performance and energy consumption trade-offs by comparing memory hierarchies for the ARM programmable core, and prove the validity of the proposed approach.
international conference on electronics, circuits, and systems | 2002
Dimitrios Soudris; Minas Dasygenis; K. Mitroglou; Konstantinos Tatas; A. Thanailakis
A systematic methodology for designing full-adder-based architectures for the scaling operation in the residue number system (RNS), together with the development of a supporting software tool, is introduced. Starting from the mathematical description of the scaling operation in RNS, we end up with the VHDL description of a full-adder-based architecture. The proposed tool was implemented in C++ and is available for PC and HP platforms. The derived architectures are characterized by smaller hardware complexity and higher throughput rates than existing ones.