Hany Kashif | Researchain

Publication

Featured researches published by Hany Kashif.

IEEE Transactions on Computers | 2015

SLA: A Stage-Level Latency Analysis for Real-Time Communication in a Pipelined Resource Model

Hany Kashif; Sina Gholamian; Hiren D. Patel

We present a communication analysis for hard real-time system interconnects. The objective is to provide tight estimates on the worst-case communication latency between communicating processing elements that use a priority-aware communication medium for data transmission. The communication model consists of communication tasks transmitting data across a series of pipelined resources. The analysis incorporates interference caused by multiple communication tasks requesting the pipelined resources, and it captures parallel transmission of data between multiple pipeline stages. We call this analysis a stage-level analysis. We evaluate the proposed analysis through simulation of synthetic benchmarks, and we apply the analysis to an instantiation of a platform proposed by Shi and Burns. Our experiments confirm that stage-level analysis provides tight upper bounds when compared to previous work and improves schedulability by 34 percent.

Embedded Software | 2013

DIME: time-aware dynamic binary instrumentation using rate-based resource allocation

Pansy Arafa; Hany Kashif; Sebastian Fischmeister

Program analysis tools are essential for understanding programs, analyzing performance, and optimizing code. Some of these tools use code instrumentation to extract information at runtime. The instrumentation process can alter program behavior such as timing behavior and memory consumption. Time-sensitive programs, however, must meet specific timing constraints and thus require that the instrumentation process, for instance, bounds the timing overhead. Time-aware instrumentation techniques try to honor the timing constraints of such programs. All previous techniques, however, support only static source-code instrumentation methods. Hence, they become impractical beyond microcontroller code for instrumenting large programs along with all their library dependencies. In this work, we propose DIME, a time-aware dynamic binary instrumentation technique that adds an adjustable bound on the timing overhead to the program under analysis. We implement DIME using the dynamic instrumentation framework Pin. Quantitative evaluation of the three implementation alternatives shows an average reduction of the instrumentation overhead by 12-, 7-, and 3-fold compared to native Pin. Instrumenting the VLC media player and a laser beam stabilization experiment demonstrates the practicality and scalability of DIME.

Asia and South Pacific Design Automation Conference | 2014

Bounding buffer space requirements for real-time priority-aware networks

Hany Kashif; Hiren D. Patel

One implementation alternative for network interconnects in modern chip-multiprocessor systems is priority-aware arbitration networks. To enable the deployment of real-time applications to priority-aware networks, recent research proposes worst-case latency (WCL) analyses for such networks. Buffer space requirements in priority-aware networks, however, are seldom addressed. In this work, we bound the buffer space required for valid WCL analyses and consequently optimize router design for application specifications by computing the required buffer space at each virtual channel in priority-aware routers. In addition to the obvious advantage of bounding buffer space while providing valid WCL bounds, buffer space reduction decreases chip area and saves energy in priority-aware networks. Our experiments show that the proposed buffer space computation reduces the number of infeasible implementations by 42% compared to an existing buffer space analysis technique. It also reduces the required buffer space in priority-aware routers by up to 79%.

Embedded and Real-Time Computing Systems and Applications | 2013

INSTEP: A static instrumentation framework for preserving extra-functional properties

Hany Kashif; Pansy Arafa; Sebastian Fischmeister

Tracing is a well-established method for debugging programs. Current approaches aim only at preserving functional correctness during the instrumentation. Preservation of functional correctness is a necessary feature of all instrumentation tools. However, few existing instrumentation tools preserve extra-functional properties of a program. As a result, specific classes of software are unable to leverage software instrumentation because it disturbs properties they depend on, e.g., timing for real-time systems, memory consumption for embedded software, and tracing bandwidth for on-board software. We present the first instrumentation framework, INSTEP, that preserves logical correctness and a rich set of extra-functional properties. INSTEP derives instrumentation alternatives based on the developer's instrumentation intent (II), abstracts the program and prunes the search space, and then instruments the program based on constraints and cost models of competing properties. We demonstrate and experiment with a fully automated framework of INSTEP with different IIs and extra-functional properties. We also experiment with a large automotive case study to show the scalability of INSTEP.

Real Time Technology and Applications Symposium | 2013

ORTAP: An Offset-based response time analysis for a pipelined communication resource model

Hany Kashif; Sina Gholamian; Rodolfo Pellizzoni; Hiren D. Patel; Sebastian Fischmeister

This work addresses the challenge of computing worst-case response times of hard real-time applications deployed on multiprocessor systems. In particular, the worst-case response time analysis (WCRTA) focuses on the communication between distributed tasks of hard real-time applications. The proposed WCRTA models the communication as a pipelined communication resource model. This model incorporates the effect of pipelining and the parallel transmission of data. Applications of such a model include multiprocessor systems that use complex interconnects such as networks-on-chip (NoCs) with priorities. In this paper, we present an exponential analysis, and a polynomial analysis, and prove its correctness. As an application, we apply the pipelined communication resource model to priority-aware NoCs, and we compare the proposed analyses against prior analysis techniques. Our experimental evaluation on two instances of 4 × 4 and 8 × 8 NoCs with 512,000 synthetic benchmarks shows 48.3% and 66.7% improvement in schedulability for the two NoC sizes over prior work.

Emerging Technologies and Factory Automation | 2012

Program transformation for time-aware instrumentation

Hany Kashif; Sebastian Fischmeister

Instrumentation is a valuable technique to gain insight into a program's behavior. Safety-critical real-time embedded applications are time sensitive, and so instrumentation techniques for this domain must especially consider timing. This work establishes the basis for measuring the effectiveness of approaches for time-aware instrumentation in addition to coverage. We define the ETP shift effectiveness metric and define its optimality criterion. We identify locations in the program where program transformation techniques can be applied to increase the instrumentability of the program. We subsequently use the proposed metric to evaluate two transformation methods that improve the effectiveness and coverage of current techniques for time-aware instrumentation by a factor of five.

Asia and South Pacific Design Automation Conference | 2012

Using link-level latency analysis for path selection for real-time communication on NoCs

Hany Kashif; Hiren D. Patel; Sebastian Fischmeister

We present a path selection algorithm that is used when deploying hard real-time traffic flows onto a chip-multiprocessor system. This chip-multiprocessor system uses a priority-based real-time network-on-chip interconnect between the multiple processors. The problem we address is the following: given a mapping of the tasks onto a chip-multiprocessor system, we need to determine the paths that the traffic flows take such that the flows meet their deadlines. Furthermore, we must ensure that the deadline is met even in the presence of direct and indirect interference from other flows sharing network links on the path. To achieve this, our algorithm utilizes a link-level analysis to determine the impact of a link being used by a flow, and its effect on other flows sharing the link. Our experimental results show that we can improve schedulability by about 8% and 15% over Minimum Interference Routing and Widest Shortest Path algorithms, respectively.

Real Time Technology and Applications Symposium | 2016

Buffer Space Allocation for Real-Time Priority-Aware Networks

Hany Kashif; Hiren D. Patel

In this work, we address the challenge of incorporating buffer space constraints in worst-case latency analysis for priority-aware networks. A priority-aware network is a wormhole-switched network-on-chip with distinct virtual channels per priority. Prior worst-case latency analyses assume that the routers have infinite buffer space allocated to the virtual channels. This assumption renders these analyses impractical when considering actual deployments. This is because an implementation of the priority-aware network imposes buffer constraints on the application. These constraints can result in back pressure on the communication, which the analyses must incorporate. Consequently, we extend a worst-case latency analysis for priority-aware networks to include buffer space constraints. We provide the theory for these extensions and prove their correctness. We experiment on a large set of synthetic benchmarks, and show that we can deploy applications on priority-aware networks with virtual channels of sizes as small as two flits. In addition, we propose a polynomial time buffer space allocation algorithm. This algorithm minimizes the buffer space required at the virtual channels while scheduling the application sets on the target priority-aware network. Our empirical evaluation shows that the proposed algorithm reduces buffer space requirements in the virtual channels by approximately 85% on average.

Modeling Analysis and Simulation on Computer and Telecommunication Systems | 2017

QDIME: QoS-Aware Dynamic Binary Instrumentation

Pansy Arafa; Guy Martin Tchamgoue; Hany Kashif; Sebastian Fischmeister

Software systems with quality of service (QoS), such as database management systems and web servers, are ubiquitous. Such systems must meet strict performance requirements. Instrumentation is a useful technique for the analysis and debugging of QoS systems. Dynamic binary instrumentation (DBI) extracts runtime information to comprehend system behavior and detect performance bottlenecks. However, existing DBI tools are intrusive, adding unacceptable delay to the program execution. Such delay alters the performance requirements and degrades the overall quality and the user experience of the system. Moreover, the delay may change the system behavior, thus producing misleading runtime information. This paper presents QDIME, a QoS-aware dynamic binary instrumentation technique that respects system performance requirements. QDIME takes a user-defined QoS threshold as an input and periodically gathers QoS feedback from the system under analysis to decide its instrumentation budget. We implemented QDIME on top of PIN, a popular DBI framework. We evaluated QDIME with Gzip, MySQL server, Apache HTTP server, and Redis. The experiments show that QDIME respects the user-defined QoS threshold and, thus, improves the performance of the monitored application manifold. QDIME is able to provide up to 100% instrumentation coverage with an average of 92% when compared to PIN. Moreover, QDIME reduces the slow-down factor of the instrumented application by factors of 1.41, 5.67, and 10.26 for Sys-trace, Call-trace, and Branch-profile, respectively. A release of QDIME is available for download at https://github.com/pansy-arafa/qdime.

ACM Transactions on Design Automation of Electronic Systems | 2016

Path Selection for Real-Time Communication on Priority-Aware NoCs

Hany Kashif; Hiren D. Patel; Sebastian Fischmeister

This work investigates selecting paths for communication flows when deploying a hard real-time application on a chip-multiprocessor system. This chip-multiprocessor system uses a priority-aware real-time network-on-chip interconnect between the processors. Given a mapping of the computation tasks onto the chip-multiprocessor, the problem we address in this work is to discover paths the communication flows take such that hard real-time deadlines of flows are met. Furthermore, we must ensure that deadlines are met even in the presence of direct and indirect interference from other flows sharing network links on the path. To achieve this, our algorithm utilizes a stage-level analysis for real-time communication to determine the impact of a network link being used by a flow, and its effect on other flows sharing the link. Since finding an optimal path selection is intractable, the algorithm uses heuristics such as selecting links with the least interference and considering lower-priority flows when dedicating links to paths of higher-priority flows. The algorithm also considers constraints on the number of virtual channels at each router port in the network. The statistically significant experimental results show an improvement in schedulability of 5% and 12% over existing path selection algorithms, namely Minimum Interference Routing and Widest Shortest Path, respectively. We also present a set-top box case study to further illustrate the benefits of using the proposed algorithm.
