Featured Researches

Performance

Age of Information in an Overtake-Free Network of Quasi-Reversible Queues

We show how to calculate the Age of Information in an overtake-free network of quasi-reversible queues, with exponential exogenous interarrivals of multiple classes of update packets and exponential service times at all nodes. Results are provided for any number of M/M/1 First-Come-First-Served (FCFS) queues in tandem, and for a network with two classes of update packets, entering through different queues in the network and exiting through the same queue. The main takeaway is that in a network with different classes of update packets, individual classes roughly preserve the ages they would achieve if they were alone in the network, except when shared queues become saturated, in which case the ages increase considerably.

Read more
Performance

Age-of-Information in the Presence of Error

We consider the peak age-of-information (PAoI) in an M/M/1 queueing system with packet delivery error, i.e., update packets can get lost during transmissions to their destination. We focus on two types of policies, one is to adopt Last-Come-First-Served (LCFS) scheduling, and the other is to utilize retransmissions, i.e., keep transmitting the most recent packet. Both policies can effectively avoid the queueing delay of a busy channel and ensure a small PAoI. Exact PAoI expressions under both policies with different error probabilities are derived, including First-Come-First-Served (FCFS), LCFS with preemptive priority, LCFS with non-preemptive priority, Retransmission with preemptive priority, and Retransmission with non-preemptive priority. Numerical results obtained from analysis and simulation are presented to validate our results.

Read more
Performance

Agile Calibration Process of Full-Stack Simulation Frameworks for V2X Communications

Computer simulations and real-world car trials are essential to investigate the performance of Vehicle-to-Everything (V2X) networks. However, simulations are imperfect models of the physical reality and can be trusted only when they indicate agreement with the real-world. On the other hand, trials lack reproducibility and are subject to uncertainties and errors. In this paper, we will illustrate a case study where the interrelationship between trials, simulation, and the reality-of-interest is presented. Results are then compared in a holistic fashion. Our study will describe the procedure followed to macroscopically calibrate a full-stack network simulator to conduct high-fidelity full-stack computer simulations.

Read more
Performance

Algorithms of evaluation of the waiting time and the modelling of the terminal activity

This paper approaches the application of the waiting model with Poisson inputs and priorities in the port activity. The arrival of ships in the maritime terminal is numerically modelled, and specific parameters for the distribution functions of service and of inputs are determined, in order to establish the waiting time of ships in the seaport and a stationary process. The modelling is based on waiting times and on the traffic coefficient.

Read more
Performance

An Analytical Solution for Probabilistic Guarantees of Reservation Based Soft Real-Time Systems

We show a methodology for the computation of the probability of deadline miss for a periodic real-time task scheduled by a resource reservation algorithm. We propose a modelling technique for the system that reduces the computation of such a probability to that of the steady state probability of an infinite state Discrete Time Markov Chain with a periodic structure. This structure is exploited to develop an efficient numeric solution where different accuracy/computation time trade-offs can be obtained by operating on the granularity of the model. More importantly we offer a closed form conservative bound for the probability of a deadline miss. Our experiments reveal that the bound remains reasonably close to the experimental probability in one real-time application of practical interest. When this bound is used for the optimisation of the overall Quality of Service for a set of tasks sharing the CPU, it produces a good sub-optimal solution in a small amount of time.

Read more
Performance

An ECM-based energy-efficiency optimization approach for bandwidth-limited streaming kernels on recent Intel Xeon processors

We investigate an approach that uses low-level analysis and the execution-cache-memory (ECM) performance model in combination with tuning of hardware parameters to lower energy requirements of memory-bound applications. The ECM model is extended appropriately to deal with software optimizations such as non-temporal stores. Using incremental steps and the ECM model, we analytically quantify the impact of various single-core optimizations and pinpoint microarchitectural improvements that are relevant to energy consumption. Using a 2D Jacobi solver as example that can serve as a blueprint for other memory-bound applications, we evaluate our approach on the four most recent Intel Xeon E5 processors (Sandy Bridge-EP, Ivy Bridge-EP, Haswell-EP, and Broadwell-EP). We find that chip energy consumption can be reduced in the range of 2.0-2.4 × on the examined processors.

Read more
Performance

An Efficient Hybrid I/O Caching Architecture Using Heterogeneous SSDs

SSDs are emerging storage devices which unlike HDDs, do not have mechanical parts and therefore, have superior performance compared to HDDs. Due to the high cost of SSDs, entirely replacing HDDs with SSDs is not economically justified. Additionally, SSDs can endure a limited number of writes before failing. To mitigate the shortcomings of SSDs while taking advantage of their high performance, SSD caching is practiced in both academia and industry. Previously proposed caching architectures have only focused on either performance or endurance and neglected to address both parameters in suggested architectures. Moreover, the cost, reliability, and power consumption of such architectures is not evaluated. This paper proposes a hybrid I/O caching architecture that while offers higher performance than previous studies, it also improves power consumption with a similar budget. The proposed architecture uses DRAM, Read-Optimized SSD, and Write-Optimized SSD in a three-level cache hierarchy and tries to efficiently redirect read requests to either DRAM or RO-SSD while sending writes to WO-SSD. To provide high reliability, dirty pages are written to at least two devices which removes any single point of failure. The power consumption is also managed by reducing the number of accesses issued to SSDs. The proposed architecture reconfigures itself between performance- and endurance-optimized policies based on the workload characteristics to maintain an effective tradeoff between performance and endurance. We have implemented the proposed architecture on a server equipped with industrial SSDs and HDDs. The experimental results show that as compared to state-of-the-art studies, the proposed architecture improves performance and power consumption by an average of 8% and 28%, respectively, and reduces the cost by 5% while increasing the endurance cost by 4.7% and negligible reliability penalty.

Read more
Performance

An In-Depth Investigation of Performance Characteristics of Hyperledger Fabric

Private permissioned blockchains, such as Hyperledger Fabric, are widely deployed across the industry to facilitate cross-organizational processes and promise improved performance compared to their public counterparts. However, the lack of empirical and theoretical results prevent precise prediction of the real-world performance. We address this gap by conducting an in-depth performance analysis of Hyperledger Fabric. The paper presents a detailed compilation of various performance characteristics using an enhanced version of the Distributed Ledger Performance Scan. Researchers and practitioners alike can use the results as guidelines to better configure and implement their blockchains and utilize the DLPS framework to conduct their measurements.

Read more
Performance

An Infinite Dimensional Model for a Many Server Priority Queue

We consider a Markovian many server queueing system in which customers are preemptively scheduled according to exogenously assigned priority levels. The priority levels are randomly assigned from a continuous probability measure rather than a discrete one and hence, the queue is modeled by an infinite dimensional stochastic process. We analyze the equilibrium behavior of the system and provide several results. We derive the Radon-Nikodym derivative (with respect to Lebesgue measure) of the measure that describes the average distribution of customer priority levels in the system; we provide a formula for the expected sojourn time of a customer as a function of his priority level; and we provide a formula for the expected waiting time of a customer as a function of his priority level. We verify our theoretical analysis with discrete-event simulations. We discuss how each of our results generalizes previous work on infinite dimensional models for single server priority queues.

Read more
Performance

An analysis of core- and chip-level architectural features in four generations of Intel server processors

This paper presents a survey of architectural features among four generations of Intel server processors (Sandy Bridge, Ivy Bridge, Haswell, and Broad- well) with a focus on performance with floating point workloads. Starting on the core level and going down the memory hierarchy we cover instruction throughput for floating-point instructions, L1 cache, address generation capabilities, core clock speed and its limitations, L2 and L3 cache bandwidth and latency, the impact of Cluster on Die (CoD) and cache snoop modes, and the Uncore clock speed. Using microbenchmarks we study the influence of these factors on code performance. This insight can then serve as input for analytic performance models. We show that the energy efficiency of the LINPACK and HPCG benchmarks can be improved considerably by tuning the Uncore clock speed without sacrificing performance, and that the Graph500 benchmark performance may profit from a suitable choice of cache snoop mode settings.

Read more

Ready to get started?

Join us today