Miljan Vuletic
École Polytechnique Fédérale de Lausanne
Publication
Featured research published by Miljan Vuletic.
Design Automation Conference | 2004
Miljan Vuletic; Laura Pozzi; Paolo Ienne
The complexity of hardware/software (HW/SW) interfacing and the lack of portability across different platforms restrain the widespread use of reconfigurable accelerators and limit designer productivity. Furthermore, communication between the SW and HW parts of codesigned applications is typically exposed to SW programmers and HW designers. In this work, we introduce a virtualization layer that allows reconfigurable application-specific coprocessors to access the user-space virtual memory and share the memory address space with user applications. The layer, consisting of an operating system (OS) extension and a HW component, shifts the burden of moving data between processor and coprocessor from the programmer to the OS, lowers the complexity of interfacing, and hides physical details of the system. Not only does the virtualization layer enhance programming abstraction and portability, but it also performs runtime optimizations: by predicting future memory accesses and speculatively prefetching data, the virtualization layer improves the coprocessor execution, so applications achieve better performance without any user intervention. We use two different reconfigurable systems-on-chip (SoCs) running Linux and codesigned applications to prove the viability of our concept. The applications run faster than their SW versions, and the overhead due to the virtualization is limited. Dynamic prefetching in the virtualization layer further reduces the abstraction overhead.
Design, Automation, and Test in Europe | 2002
Laura Pozzi; Miljan Vuletic; Paolo Ienne
The need for high performance in ASIC embedded processors, coupled with aggressive energy and area goals, is pushing researchers and designers toward processor specialisation for a given application domain. In this paper, specialisation is addressed through the introduction of Ad-hoc Functional Units: special arithmetic/logic units added to a traditional architecture to perform domain-specific complex operations.
Design, Automation, and Test in Europe | 2004
Miljan Vuletic; Ludovic Righetti; Laura Pozzi; Paolo Ienne
Reconfigurable systems-on-chip (SoCs) consist of large field-programmable gate arrays (FPGAs) and standard processors. The reconfigurable logic can be used for application-specific coprocessors to speed up the execution of applications. Their widespread use is limited by the complexity of interfacing software applications with coprocessors. We present a virtualization layer that lowers the interfacing complexity and improves portability. The layer shifts the burden of moving data between processor and coprocessor from the programmer to the operating system (OS). A reconfigurable SoC running Linux is used to prove the concept.
Application-Specific Systems, Architectures, and Processors | 2004
Miljan Vuletic; Laura Pozzi; Paolo Ienne
Despite enabling significant performance improvements, reconfigurable computing systems have not gained widespread acceptance: most reconfigurable computing paradigms lack (1) a unified and transparent programming model, and (2) a standard interface for integration of hardware accelerators. Ideally, programmers should code algorithms and designers should write hardware accelerators independently of any detail of the underlying platform. We argue that achieving portability and uniform programming with only limited loss of performance is one of the main issues that hinder the widespread acceptance of reconfigurable computing. To make reconfigurable computing globally more attractive, we suggest a transparent, portable, and hardware-agnostic programming paradigm. To achieve software code and hardware design portability, platform-specific tasks are delegated to a system-level virtualisation layer that supports a chosen programming model, much in the same way platform details are hidden from users in general-purpose computers. Although an additional abstraction inherently brings overheads, we show that the involvement of the virtualisation layer exposes potential optimisations that compensate for the overheads and bring additional speedups. As a case study, we present a real design and implementation of a number of building blocks of such a system and discuss the challenges involved in materialising the others.
Field-Programmable Technology | 2006
Miljan Vuletic; Paolo Ienne; Christopher Claus; Walter Stechele
Although naturally belonging to the user process, hardware parts of codesigned reconfigurable applications execute outside of the operating system (OS) process: they have neither a unified memory abstraction with software nor the system services provided by the OS. This imposes limitations on hardware and software interfacing, narrows available programming paradigms, and affects application portability. Advanced programming concepts, such as multithreading, usually demand additional activities on the programmer side to perform memory transfers and enforce memory consistency. In this paper, we introduce a system layer (an OS extension relying on a system hardware extension) that provides (1) unified virtual memory, (2) platform-agnostic interfacing, and (3) multithreaded execution for hardware accelerators running within the same OS process as user software. The system layer relieves the software programmer and hardware designer of interfacing burdens and still achieves significant speedups over software with only limited overheads. Virtual-memory-enabled hardware accelerators benefit from all abstractions and services already available to software. To prove our concept in practice and demonstrate the ease of programming, we execute image processing and cryptography applications on reconfigurable systems-on-chip running GNU/Linux that supports virtual memory for multithreaded hardware accelerators.
Field-Programmable Custom Computing Machines | 2004
Miljan Vuletic; Laura Pozzi; Paolo Ienne
Reconfigurable system-on-chip (SoC) platforms that incorporate hard-core processors surrounded by large amounts of FPGA are today's commodities: the reconfigurable logic is often used to speed up execution of applications by implementing critical parts of the code as application-specific coprocessors. Cryptography applications are a good example of coprocessor applications: they are known to benefit significantly from spatial execution in hardware and have an increasing importance for mobile and ubiquitous computing. One of the main limits of FPGA-based coprocessors for these systems is the fact that both the coprocessor hardware description and the software program invoking it are inevitably riddled with system details of the specific FPGA/processor interface: this significantly limits design reuse, impacts time-to-market, and makes development more complex. In this paper, we present a portable reconfigurable cryptography coprocessor designed for a virtual memory window (VMW) system. A VMW is a generic virtualisation layer composed of a hardware and an operating system component; it lowers the complexity of interfacing, increases portability, and makes it possible for the coprocessor to access the user-space virtual memory. The approach is illustrated here with the IDEA cryptography application running under Linux on a reconfigurable SoC, with its critical function mapped onto the FPGA. A significant fraction of the speedup inherent to hardware execution in the FPGA is preserved, while the hardware and software designs of the cryptography application become perfectly portable.
Field-Programmable Logic and Applications | 2004
Miljan Vuletic; Laura Pozzi; Paolo Ienne
In Reconfigurable Systems-on-Chip (RSoCs), operating systems can primarily (1) manage the sharing of limited reconfigurable resources, and (2) support communication between reconfigurable accelerators and user applications. It has been shown in previous work that the operating system can dramatically simplify the interface to reconfigurable coprocessors and isolate the programmer from all details of the hardware. A further potential of the operating system is developed here: the operating system can observe accelerators at runtime and dynamically take actions that improve their execution. The strength of involving the operating system lies in achieving better performance without any information from the end user and without changes either in the coprocessor hardware design or in the software application. Specifically, this paper presents an operating system module that monitors reconfigurable coprocessors, predicts their future memory accesses, and performs memory prefetching accordingly; the goal is to completely hide memory-to-memory communication latency. The module uses lightweight hardware support to detect coprocessors' memory access patterns. The effectiveness of the technique is demonstrated for two applications on an embedded RSoC board running the Linux operating system. Significant speedup is achieved compared to the nonprefetching version, and the improvement is obtained in a manner completely transparent to the application programmer.
Archive | 2001
Paolo Ienne; Laura Pozzi; Miljan Vuletic
Archive | 2005
Miljan Vuletic; Laura Pozzi; Paolo Ienne
Proceedings of the 1st Workshop on Application Specific Processors | 2002
Ajay K. Verma; Kubilay Atasu; Miljan Vuletic; Laura Pozzi; Paolo Ienne