Network


Latest external collaborations at the country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Juan Segarra is active.

Publication


Featured research published by Juan Segarra.


Journal of Systems Architecture | 2011

Improving the WCET computation in the presence of a lockable instruction cache in multitasking real-time systems

Luis C. Aparicio; Juan Segarra; Clemente Rodríguez; Víctor Viñals

In multitasking real-time systems, it is required to compute the WCET of each task and also the effects of interferences between tasks in the worst case. This is very complex with variable latency hardware, such as instruction cache memories, or, to a lesser extent, the line buffers usually found in the fetch path of commercial processors. Some methods disable cache replacement so that it is easier to model the cache behavior. The difficulty in these cache-locking methods lies in obtaining a good selection of the memory lines to be locked into cache. In this paper, we propose an ILP-based method to select the best lines to be loaded and locked into the instruction cache at each context switch (dynamic locking), taking into account both intra-task and inter-task interferences, and we compare it with static locking. Our results show that, without cache, the spatial locality captured by a line buffer doubles the performance of the processor. When adding a lockable instruction cache, dynamic locking systems are schedulable with a cache size between 12.5% and 50% of the cache size required by static locking. Additionally, the computation time of our analysis method does not depend on the number of possible paths in the task. This allows us to analyze large code in a relatively short time (100 KB with 10^65 paths in less than 3 minutes).
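The core selection problem can be illustrated with a much simpler stand-in for the paper's ILP formulation: choosing which memory lines to lock is, in its most stripped-down form, a knapsack over cache slots. The sketch below is an assumption-laden simplification (unit-cost lines, a single "worst-case cycles saved" benefit per line, no inter-task terms); the line names and benefit values are hypothetical.

```python
def select_lock_lines(lines, capacity):
    """Choose which memory lines to lock into an instruction cache.

    Simplified 0/1-knapsack stand-in for an ILP-based selection: each
    line occupies one cache slot and has a benefit equal to the
    worst-case miss cycles it saves when locked.  `lines` is a list of
    (name, benefit) pairs; returns (best_benefit, chosen_lines).
    """
    # dp[c] = (best benefit, chosen set) using at most c cache slots
    dp = [(0, frozenset())] * (capacity + 1)
    for name, benefit in lines:
        for c in range(capacity, 0, -1):  # iterate downwards: 0/1 knapsack
            cand = (dp[c - 1][0] + benefit, dp[c - 1][1] | {name})
            if cand[0] > dp[c][0]:
                dp[c] = cand
    return dp[capacity]

best, chosen = select_lock_lines(
    [("loop_body", 120), ("isr_entry", 90), ("init", 5), ("hot_call", 60)],
    capacity=2)
```

A real formulation additionally models per-path fetch costs and interference between tasks as ILP constraints, which is why the paper relies on a solver rather than a closed-form selection.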


embedded and real-time computing systems and applications | 2010

Combining Prefetch with Instruction Cache Locking in Multitasking Real-Time Systems

Luis C. Aparicio; Juan Segarra; Clemente Rodríguez; Víctor Viñals

In multitasking real-time systems, it is required to compute the WCET of each task and also the effects of interferences between tasks in the worst case. This is complex with the variable latency hardware usually found in the fetch path of commercial processors. Some methods disable cache replacement so that it is easier to model the cache behavior. Lock-MS is an ILP-based method to obtain the best selection of memory lines to be locked in a dynamic locking instruction cache. In this paper, we first propose a simple memory architecture implementing next-line tagged prefetch, specially designed for hard real-time systems. Then, we extend Lock-MS to add support for hardware instruction prefetch. Our results show that the WCET of a system with prefetch and an instruction cache sized at 5% of the total code size is better than that of a system with no prefetch and a cache sized at 80% of the code. We also evaluate the effects of the prefetch penalty on the resulting WCET, showing that a system without prefetch penalties has a worst-case performance of 95% of the ideal case. This highlights the importance of a good prefetch design. Finally, the computation time of our analysis method is relatively short, analyzing tasks of 96 KB with 10^65 paths in less than 3 minutes.
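Next-line tagged prefetch, the scheme named above, can be sketched in a few lines: a tag bit marks lines brought in by prefetch, and the first demand reference to a tagged line triggers the next prefetch, just as a miss does. The model below is a minimal functional sketch (unbounded cache, no timing), not the paper's architecture.

```python
class TaggedNextLinePrefetcher:
    """Minimal sketch of next-line tagged prefetch.

    `tagged` marks lines brought in by prefetch that have not yet been
    demanded; the first reference to a tagged line triggers a prefetch
    of the following line, keeping a sequential "prefetch train" going.
    """
    def __init__(self):
        self.cache = set()
        self.tagged = set()
        self.hits = self.misses = self.prefetches = 0

    def _prefetch(self, line):
        if line not in self.cache:
            self.cache.add(line)
            self.tagged.add(line)
            self.prefetches += 1

    def fetch(self, line):
        if line in self.cache:
            self.hits += 1
            if line in self.tagged:          # first use of a prefetched line
                self.tagged.discard(line)
                self._prefetch(line + 1)     # continue the prefetch train
        else:
            self.misses += 1
            self.cache.add(line)
            self._prefetch(line + 1)         # demand miss also prefetches

# On a purely sequential fetch stream, only the first access misses.
pf = TaggedNextLinePrefetcher()
for line in range(8):
    pf.fetch(line)
```

This behavior — sequential code streams reduced to a single cold miss — is what makes the scheme attractive for the predominantly straight-line fetch patterns of embedded code.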


distributed multimedia systems | 2001

Distribution of Video-on-Demand in Residential Networks

Juan Segarra; Vicent Cholvi

In this paper, we study how to distribute cache sizes in a tree-structured server for transmitting video streams in a video-on-demand (VoD) way. We use off-line smoothing for videos, and our request rates are distributed according to a 24-hour audience curve. For this purpose, we have designed a slotted-time bandwidth reservation algorithm, which we use to simulate our experiments. Our system tests quality of service (QoS) in terms of starting delay, and once a transmission has started, the system guarantees that it will be completed without any delay or quality loss. We tested it for a wide range of users (from 800 to 240,000) and for different numbers of available videos. We demonstrate that a tree-structured system with uniform cache sizes performs better than the equivalent system with a proxy-like configuration. We also study delay distribution and bandwidth usage in our system on a representative case.
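The essence of a slotted-time bandwidth reservation scheme can be sketched as follows: time is discretized into slots, each with a bandwidth capacity, and a request is admitted at the earliest start slot where every slot of the transmission has room. Function name, parameters, and values are hypothetical; the paper's algorithm also handles smoothed variable-rate videos and a server tree.

```python
def reserve(slots, need, start, length, capacity):
    """Slotted-time bandwidth reservation sketch.

    `slots` maps slot index -> bandwidth already reserved.  Book `need`
    units for `length` consecutive slots at the earliest slot >= `start`
    where every slot has room, and return that start slot.  The starting
    delay is then (returned slot - start).
    """
    t = start
    while True:
        if all(slots.get(t + k, 0) + need <= capacity for k in range(length)):
            for k in range(length):
                slots[t + k] = slots.get(t + k, 0) + need
            return t
        t += 1

# Two 6-unit streams of 3 slots each on a 10-unit link: the second
# cannot overlap the first, so it is delayed until the first finishes.
slots = {}
first = reserve(slots, need=6, start=0, length=3, capacity=10)
second = reserve(slots, need=6, start=0, length=3, capacity=10)
```

Note that once admitted, a reservation spans every slot of the transmission, which is what lets the system guarantee completion without delay or quality loss after start-up.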


Computer Communications | 2007

Convergence of periodic broadcasting and video-on-demand

Juan Segarra; Vicent Cholvi

Research on video-on-demand transmissions is essentially divided into periodic broadcasting (PB) methods and on-demand methods. Periodic broadcasting schedules transmissions off-line, so that an optimized time schedule is achieved; on the other hand, video-on-demand (VoD) has to deal with constraints at request time. Thus, studies in these areas have been quite isolated. Obviously, in periodic broadcasting all parameters are known in advance, so timetables can be accurately adjusted, and it is assumed that transmissions can be arranged to use less bandwidth than video-on-demand. In this paper, we analyze the convergence of both paradigms, showing that the claim that VoD schemes use more bandwidth than PB ones is not necessarily true. We support this argument by showing how to convert any periodic broadcasting method into an on-demand one that uses equal or less bandwidth. Moreover, we show that this converted on-demand method can also offer shorter serving times.
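The intuition behind the conversion argument can be sketched very coarsely: an on-demand server can follow the periodic-broadcast timetable but skip any scheduled transmission that no pending request can use, so it never transmits more than the PB plan does. The sketch below is a deliberately simplified model (a segment is useful only if some request arrived at or before its slot); it illustrates the bandwidth inequality, not the paper's full construction.

```python
def on_demand_from_periodic(schedule, requests):
    """Coarse sketch of converting a periodic-broadcast plan to on-demand.

    `schedule` is a list of (time, segment) pairs from a PB timetable;
    `requests` is a list of request arrival times.  Transmit a scheduled
    segment only if at least one request has already arrived, so the
    on-demand plan uses at most as many transmissions as the PB plan.
    """
    return [(time, seg) for time, seg in schedule
            if any(t <= time for t in requests)]

# With the only viewer arriving at t=7, the slots at t=0 and t=5 served
# nobody in the PB plan and are simply skipped on demand.
schedule = [(0, "s1"), (5, "s2"), (10, "s1"), (15, "s3")]
sent = on_demand_from_periodic(schedule, requests=[7])
```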


international conference on computational science | 2003

Simulations on batching in video-on-demand transmissions

Juan Segarra; Vicent Cholvi

One of the methods for taking advantage of multicast services is the use of batching. With this method, several requests for the same video are grouped and transmitted together, using only the bandwidth required for one transmission. This method is commonly used in the transmission of streamed data. In this paper, we analyze system performance with explicit constant batching and demonstrate that a system without explicit batching performs better in terms of delays. We also propose a dynamic batching policy that improves system performance in both mean and maximum serving times.
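Constant batching, as described above, can be sketched as grouping all requests for the same video that arrive within a fixed window of the batch leader into one multicast stream. This is a hypothetical simplified model (no channel limits, off-line request list), not the paper's simulator.

```python
def batch_requests(requests, window):
    """Group (arrival_time, video) requests into batches.

    All requests for the same video arriving within `window` of the
    batch leader share one multicast transmission; a later request
    opens a new batch.  Returns a list of batches (arrival times).
    """
    batches = []
    open_batches = {}                  # video -> (leader arrival, batch index)
    for t, video in sorted(requests):
        leader = open_batches.get(video)
        if leader is not None and t - leader[0] <= window:
            batches[leader[1]].append(t)   # join the open batch
        else:
            open_batches[video] = (t, len(batches))
            batches.append([t])            # start a new batch
    return batches

# Requests at t=0,1,6 for video A and t=5 for video B, window of 2:
# the first two A-requests share a stream; the t=6 request is too late.
batches = batch_requests([(0, "A"), (1, "A"), (5, "B"), (6, "A")], window=2)
```

The trade-off the paper quantifies is visible even here: batching saves one stream but forces the t=0 request to wait out the window, which is why a dynamic policy that adapts the window can beat a constant one.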


embedded and real-time computing systems and applications | 2008

Avoiding the WCET Overestimation on LRU Instruction Cache

Luis C. Aparicio; Juan Segarra; C. Rodríguez; J. L. Villarroel; V. Viñals

The WCET computation is one of the main challenges in hard real-time systems, since all further analysis is based on this value. The complexity of this problem leads existing analysis methods to compute WCET bounds instead of the exact WCET. In this work, we propose a technique to compute the exact instruction fetch contribution to the WCET (IFC-WCET) in the presence of an LRU instruction cache. We prove that an exact computation does not need to analyze the full exponential number of possible execution paths, but only a bounded subset of them. In the benchmark codes we have studied, the IFC-WCET is up to 62% lower than a bound computed with a widely used approach, and the number of possible execution paths is extremely large compared with the ones relevant for the analysis.
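The quantity being maximized over paths can be made concrete with a per-path miss count on an LRU cache. The sketch below models a fully associative LRU instruction cache and counts the fetch misses along one execution path; an exact IFC-WCET is the maximum of this count over all feasible paths, which is what the paper shows can be computed without enumerating every path. The trace values are hypothetical.

```python
from collections import OrderedDict

def lru_fetch_misses(trace, ways):
    """Count instruction-fetch misses on a fully associative LRU cache.

    `trace` is the sequence of memory lines fetched along one execution
    path; `ways` is the cache size in lines.  OrderedDict keeps lines in
    recency order: front = least recently used.
    """
    cache = OrderedDict()
    misses = 0
    for line in trace:
        if line in cache:
            cache.move_to_end(line)          # mark most recently used
        else:
            misses += 1
            if len(cache) == ways:
                cache.popitem(last=False)    # evict the LRU line
            cache[line] = None
    return misses

# Lines 1,2,3 fill the 3-way cache; re-touching 1 and 2 keeps them hot,
# so fetching 4 evicts line 3 and the final fetch of 1 still hits.
misses = lru_fetch_misses([1, 2, 3, 1, 2, 4, 1], ways=3)
```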


Computer Communications | 2008

Analysis and placement of storage capacity in large distributed video servers

Vicent Cholvi; Juan Segarra

In this paper, we study how to distribute storage capacity along a hierarchical system with cache servers located at each node. This system is intended to deliver stored video streams in a video-on-demand way, ensuring that, once started, a transmission will be completed without any delay or quality loss. We use off-line smoothing for videos, dividing them into CBR video parts. Also, our request rates are distributed following a 24-hour audience curve. In this system, when a request is received, the server reserves the required bandwidth at the required time slots, trying to serve the video as soon as possible. We perform a detailed analysis, by means of simulations, of the start-up time delay for several storage distributions. It shows that an adequate storage distribution can increase performance by about 25% with respect to a uniform distribution and by about 47% with respect to one in which all the storage is attached to the gateway routers that connect the final users. We also analyze bandwidth usage, comparing the behavior of these storage distributions. Finally, we present a method which allows dynamic and transparent video reallocations when popularity changes.


international conference on distributed computing systems workshops | 2004

On-line advancement of transmission plans in video-on-demand

Juan Segarra; Vicent Cholvi

We detail an algorithm for the transmission of video-on-demand which allows transmissions to be advanced dynamically when there is enough bandwidth. The effect is a transmission speed-up and, consequently, the freeing of bandwidth reserved in the future; this freed bandwidth can then be used by other transmissions. We evaluate two basic criteria for producing these advancements, and also detail how this scheme can be combined with other existing on-demand transmission methods. In our experiments we have combined it with an implementation of patching. We compare the performance of our algorithm against this model, and also against a system without merging, taken as a reference. We demonstrate that, during the high-audience period, delays resulting from our approach are about 44% lower than without advancements, and we also show the importance of the criteria used for these advancements.


ACM Transactions on Embedded Computing Systems | 2015

ACDC: Small, Predictable and High-Performance Data Cache

Juan Segarra; Clemente Rodríguez; Ruben Gran; Luis C. Aparicio; Víctor Viñals

In multitasking real-time systems, the worst-case execution time (WCET) of each task and also the effects of interferences between tasks in the worst-case scenario need to be calculated. This is especially complex in the presence of data caches. In this article, we propose a small instruction-driven data cache (256 bytes) that effectively exploits locality. It works by preselecting a subset of memory instructions that will have data cache replacement permission. Selection of such instructions is based on data reuse theory. Since each selected memory instruction replaces its own data cache line, pollution is prevented and task performance becomes independent of the size of the associated data structures. We have modeled several memory configurations using the Lock-MS WCET analysis method. Our results show that, on average, our data cache effectively services 88% of the program data of the tested benchmarks. Such results double the worst-case performance of our tested multitasking experiments. In addition, in the worst case, they reach between 75% and 89% of the ideal case of always hitting in instruction and data caches. We also show that using partitioning on our proposed hardware provides only marginal benefits in worst-case performance, so using partitioning is discouraged. Finally, we study the viability of our proposal on the MiBench application suite by characterizing its data reuse, achieving hit ratios beyond 90% in most programs.
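The replacement-permission idea can be sketched functionally: only preselected memory instructions may allocate, and each one owns exactly one cache line, so a non-selected access can hit but can never evict anything. The sketch below is an assumption-heavy simplification inspired by that idea (full-line granularity, no timing, hypothetical PCs and addresses), not the article's hardware design.

```python
class InstructionDrivenCache:
    """Sketch of a data cache with per-instruction replacement permission.

    Only instructions in `selected_pcs` may replace cache contents, and
    each such instruction replaces its own line, so pollution from
    non-selected accesses is impossible by construction.
    """
    def __init__(self, selected_pcs):
        self.selected = set(selected_pcs)
        self.owned = {}                # instruction PC -> cached line address
        self.hits = self.misses = 0

    def access(self, pc, line):
        if line in self.owned.values():
            self.hits += 1             # any instruction may hit on cached data
        elif pc in self.selected:
            self.misses += 1
            self.owned[pc] = line      # replace only this instruction's line
        else:
            self.misses += 1           # miss, but no allocation: no pollution

# Only the instruction at PC 0x10 has replacement permission.  Its line
# stays resident no matter how many other instructions miss around it.
dc = InstructionDrivenCache(selected_pcs={0x10})
for pc, line in [(0x10, 100), (0x20, 100), (0x20, 200), (0x10, 100)]:
    dc.access(pc, line)
```

The key property the abstract claims follows directly from this structure: since each selected instruction owns its line, worst-case behavior does not degrade as the program's data structures grow.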


real time technology and applications symposium | 2012

A Small and Effective Data Cache for Real-Time Multitasking Systems

Juan Segarra; Clemente Rodríguez; Ruben Gran; Luis C. Aparicio; Víctor Viñals

In multitasking real-time systems, the WCET of each task and also the effects of interferences between tasks in the worst-case scenario need to be calculated. This is especially complex with data caches. In this paper, we propose a small instruction-driven data cache (256 bytes) that effectively exploits locality. It works by preselecting a subset of memory instructions that will have data cache replacement permission. Selection of such instructions is based on data reuse theory. Since each selected memory instruction replaces its own data cache line, pollution is prevented and task performance becomes independent of the size of the associated data structures. We have modeled several memory configurations using the Lock-MS WCET analysis method. Our results show that, on average, our data cache effectively services 88% of program data. Such results translate into doubling the worst-case performance of the tested real-time multitasking experiments, which reach between 75% and 89% of the ideal case of always hitting in instruction and data caches. Additionally, we show that using partitioning on our proposed hardware provides only marginal benefits.

Collaboration


Dive into Juan Segarra's collaborations.

Top Co-Authors


Clemente Rodríguez

University of the Basque Country


Ruben Gran

University of Zaragoza
