
Publication


Featured research published by Karthik Lakshmanan.


Real-Time Systems Symposium | 2010

Scheduling Parallel Real-Time Tasks on Multi-core Processors

Karthik Lakshmanan; Shinpei Kato; Ragunathan Rajkumar

Massively multi-core processors are rapidly gaining market share, with major chip vendors offering an ever-increasing number of cores per processor. From a programming perspective, the sequential programming model does not scale very well for such multi-core systems. Parallel programming models such as OpenMP present promising solutions for more effectively using multiple processor cores. In this paper, we study the problem of scheduling periodic real-time tasks on multiprocessors under the fork-join structure used in OpenMP. We illustrate the theoretical best-case and worst-case periodic fork-join task sets from a processor utilization perspective. Based on our observations of these task sets, we provide a partitioned preemptive fixed-priority scheduling algorithm for periodic fork-join tasks. The proposed multiprocessor scheduling algorithm is shown to have a resource augmentation bound of 3.42, which implies that any task set that is feasible on m unit-speed processors can be scheduled by the proposed algorithm on m processors that are 3.42 times faster.
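
A small sketch can make the resource-augmentation statement concrete: modeling a fork-join task as alternating sequential and parallel segments, running on processors that are 3.42 times faster is equivalent to dividing every execution requirement by 3.42. The task structure, field names, and numbers below are illustrative assumptions, not the paper's model or algorithm.

```python
# Toy illustration (not the paper's algorithm) of how a resource augmentation
# bound is read. A fork-join task is modeled, under assumed notation, as
# alternating sequential segments and parallel segments.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class ForkJoinTask:
    period: float
    seq_segments: List[float]               # lengths of sequential segments
    par_segments: List[Tuple[int, float]]   # (thread count, per-thread length)

    def total_work(self) -> float:
        return sum(self.seq_segments) + sum(n * c for n, c in self.par_segments)

    def critical_path(self) -> float:
        # Longest chain: every sequential segment plus one thread per parallel segment.
        return sum(self.seq_segments) + sum(c for _, c in self.par_segments)

def speed_up(task: ForkJoinTask, factor: float) -> ForkJoinTask:
    """Model processors that are `factor` times faster by shrinking every execution requirement."""
    return ForkJoinTask(
        period=task.period,
        seq_segments=[c / factor for c in task.seq_segments],
        par_segments=[(n, c / factor) for n, c in task.par_segments],
    )

# Reading of the bound: if a task set is feasible on m unit-speed cores (so, necessarily,
# total_work/m <= period and critical_path <= period for each task), then the same set with
# execution times divided by 3.42 is claimed schedulable by the paper's partitioned algorithm.
task = ForkJoinTask(period=10.0, seq_segments=[1.0, 1.0], par_segments=[(4, 2.0)])
fast = speed_up(task, 3.42)
print(task.total_work(), task.critical_path())
print(fast.total_work(), fast.critical_path())
```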


Information Processing in Sensor Networks | 2010

U-connect: a low-latency energy-efficient asynchronous neighbor discovery protocol

Arvind Kandhalu; Karthik Lakshmanan; Ragunathan Rajkumar

Mobile sensor nodes can be used for a wide variety of applications such as social networks and location tracking. An important requirement for all such applications is that the mobile nodes need to actively discover their neighbors with minimal energy and latency. Nodes in mobile networks are not necessarily synchronized with each other, making the neighbor discovery problem all the more challenging. In this paper, we propose a neighbor discovery protocol called U-Connect, which achieves neighbor discovery at minimal and predictable energy costs while allowing nodes to pick dissimilar duty-cycles. We provide a theoretical formulation of this asynchronous neighbor discovery problem, and evaluate it using the power-latency product metric. We analytically establish that U-Connect is an 1.5-approximation algorithm for the symmetric asynchronous neighbor discovery problem, whereas existing protocols like Quorum and Disco are 2-approximation algorithms. We evaluate the performance of U-Connect and compare the performance of U-Connect with that of existing neighbor discovery protocols. We have implemented U-Connect on our custom portable FireFly Badge hardware platform. A key aspect of our implementation is that it uses a slot duration of only 250μs, and achieves orders of magnitude lower latency for a given duty cycle compared to existing schemes for wireless sensor networks. We provide experimental results from our implementation on a network of around 20 sensor nodes. Finally, we also describe a Friend-Finder application that uses the neighbor discovery service provided by U-Connect.
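
Purely as an illustration, the toy simulation below uses a prime-parameter wake-up schedule of the kind described for U-Connect (awake every p-th slot, plus a block of (p+1)/2 consecutive slots every p² slots); the exact published schedule, slot timing, and analysis are in the paper, so treat the pattern and the parameters here as assumptions.

```python
# Toy simulation of symmetric asynchronous neighbor discovery with an assumed
# prime-based active pattern approximating U-Connect's schedule.

def awake(t: int, p: int, offset: int) -> bool:
    """Is a node with prime parameter p and clock offset `offset` awake in global slot t?"""
    local = t + offset                      # nodes are unsynchronized: each has its own phase
    return local % p == 0 or local % (p * p) < (p + 1) // 2

def discovery_latency(p: int, offset_a: int, offset_b: int, horizon: int = 100000) -> int:
    """First global slot in which both nodes are awake simultaneously."""
    for t in range(horizon):
        if awake(t, p, offset_a) and awake(t, p, offset_b):
            return t
    raise RuntimeError("no overlap within horizon")

# For this pattern the duty cycle is roughly 1/p + (p + 1) / (2 * p * p), about 3/(2p),
# and discovery happens within on the order of p*p slots. The paper evaluates protocols by
# the power-latency product and shows U-Connect is a 1.5-approximation for the symmetric case.
p = 31
print(discovery_latency(p, offset_a=0, offset_b=417))
```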


Real-Time Systems Symposium | 2009

On the Scheduling of Mixed-Criticality Real-Time Task Sets

Dionisio de Niz; Karthik Lakshmanan; Ragunathan Rajkumar

The functional consolidation induced by cost-reduction trends in embedded systems can force tasks of different criticality (e.g., ABS braking with DVD playback) to share a processor and interfere with each other. These systems are known as mixed-criticality systems. While traditional temporal isolation techniques prevent all inter-task interference, they waste utilization because they must reserve the absolute worst-case execution time (WCET) for every task. In many mixed-criticality systems the WCET is not only rarely observed but at times difficult to calculate, such as the time to localize all possible objects in an obstacle avoidance algorithm. In this situation it is more appropriate to allow the execution time to grow by stealing cycles from lower-criticality tasks. Even more crucial is the fact that temporal isolation techniques can stop a high-criticality task (that was overrunning its nominal WCET) to allow a low-criticality task to run, making the former miss its deadline. We identify this as the criticality inversion problem. In this paper, we characterize the criticality inversion problem and present a new scheduling scheme called zero-slack scheduling that implements an alternative protection scheme we refer to as asymmetric protection. This protection only prevents interference from lower-criticality to higher-criticality tasks and improves the schedulable utilization. We use an offline algorithm with two parts: a zero-slack calculation algorithm and a slack analysis algorithm. The zero-slack calculation algorithm minimizes the utilization needed by a task set by reducing the time low-criticality tasks are preempted by high-criticality ones. This algorithm can be used with priority-based preemptive schedulers (e.g., RMS, EDF). The slack analysis algorithm is specific to each priority-based preemptive scheduler, and we develop and evaluate the one for RMS. We prove that this algorithm provides the same level of protection against criticality inversion as the best known priority assignment for this purpose, criticality as priority assignment (CAPA). We also prove that zero-slack RM provides the same level of schedulable utilization as RMS when all tasks have equal criticality levels. Finally, we present our implementation of the runtime enforcement mechanisms in Linux/RK to demonstrate its practicality.
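
As a sketch of the asymmetric-protection idea, the toy dispatcher below suspends lower-criticality work once any job has crossed its zero-slack instant. The zero-slack instants themselves are computed offline by the paper's algorithm and are simply assumed inputs here; task names, priorities, and values are made up for illustration.

```python
# Illustrative dispatcher: a job runs in normal mode until its precomputed zero-slack
# instant; past that instant it is in critical mode, and only jobs of equal or higher
# criticality may run (asymmetric protection).
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Job:
    name: str
    priority: int          # larger = higher scheduling priority
    criticality: int       # larger = more critical
    zero_slack: float      # time since release at which critical mode starts (assumed given)
    released_at: float
    remaining: float

def pick_job(jobs: List[Job], now: float) -> Optional[Job]:
    ready = [j for j in jobs if j.remaining > 0 and j.released_at <= now]
    if not ready:
        return None
    # Jobs that have crossed their zero-slack instant are in critical mode.
    critical = [j for j in ready if now - j.released_at >= j.zero_slack]
    if critical:
        # Suspend anything less critical than the most critical job in critical mode.
        floor = max(j.criticality for j in critical)
        ready = [j for j in ready if j.criticality >= floor]
    return max(ready, key=lambda j: j.priority)

jobs = [
    Job("video_decode", priority=3, criticality=1, zero_slack=5.0, released_at=0.0, remaining=4.0),
    Job("abs_control",  priority=2, criticality=3, zero_slack=2.0, released_at=0.0, remaining=3.0),
]
print(pick_job(jobs, now=1.0).name)   # before abs_control's zero-slack instant: priority wins
print(pick_job(jobs, now=2.5).name)   # after it: the low-criticality job is suspended
```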


Euromicro Conference on Real-Time Systems | 2009

Partitioned Fixed-Priority Preemptive Scheduling for Multi-core Processors

Karthik Lakshmanan; Ragunathan Rajkumar; John P. Lehoczky

Energy and thermal considerations are increasingly driving system designers to adopt multi-core processors. In this paper, we consider the problem of scheduling periodic real-time tasks on multi-core processors using fixed-priority preemptive scheduling. Specifically, we focus on the partitioned (static binding) approach, which statically allocates tasks to processing cores. The well-established 50% bound for partitioned multiprocessor scheduling [10] can be overcome by task-splitting (TS) [19], which allows a task to be split across more than one core. We prove that a utilization bound of 60% per core can be achieved by the partitioned deadline-monotonic scheduling (PDMS) class of algorithms on implicit-deadline task sets, when the highest-priority task on each processing core is allowed to be split (HPTS). Given the widespread usage of fixed-priority scheduling in commercial real-time and non-real-time operating systems (e.g., VxWorks, Linux), establishing such utilization bounds is both relevant and useful. We also show that a specific instance of PDMS HPTS, where tasks are allocated in decreasing order of size, called PDMS HPTS DS, has a utilization bound of 65% on implicit-deadline task sets. The PDMS HPTS DS algorithm also achieves a utilization bound of 69% on lightweight implicit-deadline task sets where no single task utilization exceeds 41.4%. The average-case behavior of PDMS HPTS DS is studied using randomly generated task sets, and it is seen to have an average schedulable utilization of 88%. We also characterize the overhead of task-splitting using measurements on an Intel Core 2 Duo processor.
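
The sketch below conveys the flavor of partitioned allocation with task splitting: tasks are placed in decreasing utilization order, a core admits a task while a simple sufficient test holds (the Liu-Layland bound stands in for the paper's deadline-monotonic analysis), and a task that fits nowhere whole is split so that one piece lands on the core with the most room. The published PDMS HPTS DS algorithm restricts splitting to the highest-priority task on a core and uses a tighter test, so treat this purely as an assumption-laden illustration, not the algorithm itself.

```python
# Simplified sketch of partitioned allocation with task splitting (illustrative only).

def ll_bound(n: int) -> float:
    """Liu-Layland sufficient utilization bound for n fixed-priority tasks."""
    return n * (2 ** (1.0 / n) - 1)

def fits(core: list, u: float) -> bool:
    return sum(core) + u <= ll_bound(len(core) + 1)

def partition_with_splitting(utils, num_cores):
    cores = [[] for _ in range(num_cores)]
    pending = sorted(utils, reverse=True)                    # decreasing size
    while pending:
        u = pending.pop(0)
        target = next((c for c in cores if fits(c, u)), None)
        if target is not None:
            target.append(u)
            continue
        # Split: put the largest admissible piece on the core with the most room,
        # keep the remainder pending like a fresh task.
        best, room = None, 0.0
        for c in cores:
            slack = ll_bound(len(c) + 1) - sum(c)
            if slack > room:
                best, room = c, slack
        if best is None:
            raise RuntimeError("task set not schedulable by this sketch")
        piece = min(u, room)
        best.append(piece)
        if u - piece > 1e-9:
            pending.insert(0, u - piece)
    return cores

print(partition_with_splitting([0.55, 0.5, 0.35], num_cores=2))
```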


Real-Time Systems Symposium | 2009

Coordinated Task Scheduling, Allocation and Synchronization on Multiprocessors

Karthik Lakshmanan; Dionisio de Niz; Ragunathan Rajkumar

Chip-multiprocessors represent a dominant new shift in the field of processor design. Better utilization of such technology in the real-time context requires coordinated approaches to task allocation, scheduling, and synchronization. In this paper, we characterize various scheduling penalties arising from multiprocessor task synchronization, including (i) blocking delays on global critical sections, (ii) back-to-back execution due to jitter from blocking, and (iii) multiple priority inversions due to remote resource sharing. We analyze the impact of these scheduling penalties under different execution control policies (ECPs) which compensate for the scheduling penalties incurred by tasks due to remote blocking. Subsequently, we develop a synchronization-aware task allocation algorithm for explicitly accommodating these global task synchronization penalties. The key idea of our algorithm is to bundle tasks that access a common shared resource and co-locate them, thereby transforming global resource sharing into local sharing. This approach reduces the above-mentioned penalties associated with remote task synchronization. Experimental results indicate that such a coordinated approach to scheduling, allocation, and synchronization yields significant benefits (as much as 50% savings in terms of required number of processing cores). An implementation of this approach is available as a part of our RT-MAP library, which uses the pthreads implementation of Linux-2.6.22.
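
A minimal sketch of the bundling idea follows: tasks that touch the same shared resource are grouped and placed on one core, so their critical sections become local instead of global. The capacity value, best-fit placement, and outright rejection of oversized bundles are simplifying assumptions; the paper's algorithm handles those cases and accounts for the blocking terms.

```python
# Sketch of the bundling heuristic: co-locate tasks that share a resource.
from collections import defaultdict

def bundle_by_resource(tasks):
    """tasks: list of (name, utilization, shared_resource_or_None)."""
    bundles, solo = defaultdict(list), []
    for task in tasks:
        name, util, resource = task
        (bundles[resource] if resource is not None else solo).append(task)
    return list(bundles.values()) + [[t] for t in solo]

def allocate(tasks, num_cores, capacity=0.69):
    cores = [[] for _ in range(num_cores)]
    loads = [0.0] * num_cores
    # Place heavier bundles first, best-fit by remaining capacity.
    for grp in sorted(bundle_by_resource(tasks), key=lambda g: -sum(u for _, u, _ in g)):
        util = sum(u for _, u, _ in grp)
        candidates = [c for c in range(num_cores) if loads[c] + util <= capacity]
        if not candidates:
            raise RuntimeError("bundle does not fit whole; the paper's algorithm handles this case")
        best = min(candidates, key=lambda c: capacity - (loads[c] + util))
        cores[best].extend(grp)
        loads[best] += util
    return cores

tasks = [("t1", 0.20, "R1"), ("t2", 0.15, "R1"), ("t3", 0.30, None),
         ("t4", 0.25, "R2"), ("t5", 0.10, "R2"), ("t6", 0.20, None)]
print(allocate(tasks, num_cores=2))
```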


Real-Time Systems Symposium | 2011

RGEM: A Responsive GPGPU Execution Model for Runtime Engines

Shinpei Kato; Karthik Lakshmanan; Aman Kumar; Mihir Kelkar; Yutaka Ishikawa; Ragunathan Rajkumar

General-purpose computing on graphics processing units, also known as GPGPU, is a burgeoning technique for accelerating parallel programs. Applying this technique to real-time applications, however, requires additional support for timeliness of execution. In particular, the non-preemptive nature of GPGPU, associated with copying data to/from the device memory and launching code onto the device, needs to be managed in a timely manner. In this paper, we present a responsive GPGPU execution model (RGEM), which is a user-space runtime solution to protect the response times of high-priority GPGPU tasks from competing workloads. RGEM splits a memory-copy transaction into multiple chunks so that preemption points appear at chunk boundaries. It also ensures that only the highest-priority GPGPU task launches code onto the device at any given time, to avoid performance interference caused by concurrent launches. A prototype implementation of an RGEM-based CUDA runtime engine is provided to evaluate the real-world impact of RGEM. Our experiments demonstrate that the response times of high-priority GPGPU tasks can be protected under RGEM, whereas their response times increase in an unbounded fashion without RGEM support as the data sizes of competing workloads increase.
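
The chunking mechanism can be pictured with the short sketch below: a long, otherwise non-preemptive copy is carved into fixed-size chunks, and each chunk boundary is a point where the task can yield the device to a more urgent one. The device interaction is simulated here; the actual RGEM runtime issues CUDA memory copies and coordinates kernel launches, which this sketch does not do.

```python
# Sketch of the chunking idea: chunk boundaries become preemption points, so the
# blocking a high-priority task can suffer is bounded by one chunk's copy time.

def chunked_copy(total_bytes: int, chunk_size: int, higher_priority_waiting):
    """Simulate copying `total_bytes` in chunks of `chunk_size`.

    `higher_priority_waiting` is a callable supplied by the runtime; whenever it
    returns True at a chunk boundary, this task gives up the device before continuing.
    """
    copied, yields = 0, 0
    while copied < total_bytes:
        copied = min(copied + chunk_size, total_bytes)      # one chunk "copied"
        if copied < total_bytes and higher_priority_waiting():
            yields += 1                                      # preemption point taken
    return copied, yields

# Toy run: pretend a high-priority task shows up partway through the transfer.
arrivals = iter([False, True, False, False])
print(chunked_copy(total_bytes=4 << 20, chunk_size=1 << 20,
                   higher_priority_waiting=lambda: next(arrivals, False)))
```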


International Conference on Distributed Computing Systems | 2010

Resource Allocation in Distributed Mixed-Criticality Cyber-Physical Systems

Karthik Lakshmanan; Dionisio de Niz; Ragunathan Rajkumar; Gabriel A. Moreno

Large-scale distributed cyber-physical systems will have many sensors/actuators (each with local micro-controllers) and a distributed communication/computing backbone with multiple processors. Many cyber-physical applications will be safety-critical, and in many cases unexpected workload spikes are likely to occur due to unpredictable changes in the physical environment. In the face of such overload scenarios, the desirable property in such systems is that the most critical applications continue to meet their deadlines. In this paper, we capture this mixed-criticality property by developing a formal overload-resilience metric called ductility. The generality of ductility enables it to evaluate any scheduling algorithm from the perspective of mixed-criticality cyber-physical systems. In distributed cyber-physical systems, this ductility is the result of both the task-to-processor packing (a.k.a. bin packing) and the uniprocessor scheduling algorithms used. In this paper, we present a ductility-maximization packing algorithm to complement our previous work on mixed-criticality uniprocessor scheduling. Our packing algorithm, known as Compress-on-Overload Packing (COP), is a criticality-aware greedy bin-packing algorithm that maximizes the tolerance of high-criticality tasks to overloads. We compare the ductility of COP against the Worst-Fit Decreasing (WFD) bin-packing heuristic used traditionally for load balancing in distributed systems, and show that the performance of COP dominates WFD in the average case and can reach close to five times better ductility when resources are limited. Finally, we illustrate the practical use of COP in distributed cyber-physical systems using a radar surveillance application, and provide an overview of the entire process from assigning task criticality levels to evaluating its performance.
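
For reference, the Worst-Fit Decreasing baseline mentioned above is easy to state: items are taken in decreasing size order and each goes to the bin with the most remaining capacity. The sketch below shows only this baseline; COP's criticality-aware rules are specific to the paper and are not reproduced here.

```python
# Worst-Fit Decreasing (WFD), the load-balancing baseline the abstract compares COP against.

def worst_fit_decreasing(utilizations, num_bins, capacity=1.0):
    bins = [[] for _ in range(num_bins)]
    loads = [0.0] * num_bins
    for u in sorted(utilizations, reverse=True):
        i = min(range(num_bins), key=lambda b: loads[b])   # bin with the most remaining capacity
        if loads[i] + u > capacity:
            raise RuntimeError("item does not fit")
        bins[i].append(u)
        loads[i] += u
    return bins

print(worst_fit_decreasing([0.5, 0.4, 0.4, 0.3, 0.2, 0.2], num_bins=3))
```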


Real-Time Technology and Applications Symposium | 2011

Mixed-Criticality Task Synchronization in Zero-Slack Scheduling

Karthik Lakshmanan; Dionisio de Niz; Ragunathan Rajkumar

Recent years have seen an increasing interest in the scheduling of mixed-criticality real-time systems. These systems are composed of groups of tasks with different levels of criticality deployed over the same processor(s). Such systems must be able to accommodate additional execution-time requirements that may occasionally be needed. When overload conditions develop, critical tasks must still meet their timing constraints at the expense of less critical tasks. Zero-slack scheduling algorithms are promising candidates for such systems. These algorithms guarantee that all tasks meet their deadlines when no overload occurs, and that criticality ordering is satisfied under overloads. Unfortunately, when mutually exclusive resources are shared across tasks, these guarantees are voided. Furthermore, the dual-execution modes of tasks in mixed-criticality systems violate the assumptions of traditional real-time synchronization protocols like PCP and hence the latter cannot be used directly. In this paper, we develop extensions to real-time synchronization protocols (Priority Inheritance and Priority Ceiling Protocol) that coordinate the mode changes of the zero-slack scheduler. We analyze the properties of these new protocols and the blocking terms they introduce. We maintain the deadlock avoidance property of our PCP extension, called the Priority and Criticality Ceiling Protocol (PCCP), and limit the blocking to only one critical section for each of the zero-slack scheduling execution modes. We also develop techniques to accommodate the blocking terms arising from synchronization, in calculating the zero-slack instants used by the scheduler. Finally, we conduct an experimental evaluation of PCCP. Our evaluation shows that PCCP is able to take advantage of the capacity of zero-slack schedulers to reclaim unused over-provisioning of resources that are only used in critical execution modes. This allows PCCP to accommodate larger blocking terms.
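
As background, the classical Priority Ceiling Protocol locking rule that PCCP extends can be stated in a few lines: a job may acquire a resource only if its priority is strictly higher than the ceilings of all resources currently held by other jobs. The criticality ceilings and mode-change coordination that PCCP adds are the paper's contribution and are not shown here; the names and values below are illustrative.

```python
# Classical PCP locking rule (background only; not PCCP itself).

def can_lock(job_id: str, priority: int, locked: dict, ceilings: dict) -> bool:
    """locked maps resource -> owning job id; ceilings maps resource -> priority ceiling."""
    return all(priority > ceilings[r] for r, owner in locked.items() if owner != job_id)

ceilings = {"R1": 5, "R2": 3}        # ceiling = highest priority of any task that uses the resource
locked = {"R2": "low_prio_job"}      # a lower-priority job currently holds R2
print(can_lock("high_prio_job", 4, locked, ceilings))   # True: 4 > ceiling(R2) = 3
print(can_lock("mid_prio_job", 2, locked, ceilings))    # False: blocked through R2's ceiling
```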


Real-Time Technology and Applications Symposium | 2011

Resource Sharing in GPU-Accelerated Windowing Systems

Shinpei Kato; Karthik Lakshmanan; Yutaka Ishikawa; Ragunathan Rajkumar

Recent windowing systems allow graphics applications to directly access the graphics processing unit (GPU) for fast rendering. However, application tasks that render frames on the GPU contend heavily with the windowing server that also accesses the GPU to blit the rendered frames to the screen. This resource-sharing nature of direct rendering introduces core challenges of priority inversion and temporal isolation in multi-tasking environments. In this paper, we identify and address resource-sharing problems raised in GPU-accelerated windowing systems. Specifically, we propose two protocols that enable application tasks to efficiently share the GPU resource in the X Window System. The Priority Inheritance with X server (PIX) protocol eliminates priority inversion caused in accessing the GPU, and the Reserve Inheritance with X server (RIX) protocol addresses the same problem for resource-reservation systems. Our design and implementation of these protocols highlight the fact that neither the X server nor user applications need modifications to use our solutions. Our evaluation demonstrates that multiple GPU-accelerated graphics applications running concurrently in the X Window System can be correctly prioritized and isolated by the PIX and the RIX protocols.
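
The inheritance rule at the heart of PIX can be sketched independently of the X server details: while the windowing server has GPU work pending on behalf of clients, it should run at the priority of the highest-priority such client, so a low-priority client cannot delay a high-priority one through the shared server. The function below only expresses that rule; the client names and priority values are made up, and the actual protocol operates on X server GPU access as described in the paper.

```python
# Priority-inheritance rule for a shared windowing server (illustrative).

def server_priority(base_priority: int, pending_clients: dict) -> int:
    """pending_clients maps client id -> priority of that client's queued GPU work."""
    return max([base_priority, *pending_clients.values()])

print(server_priority(0, {"hud_app": 10, "log_viewer": 1}))   # server inherits priority 10
print(server_priority(0, {}))                                  # nothing pending: base priority
```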


Real-Time Technology and Applications Symposium | 2010

Scheduling Self-Suspending Real-Time Tasks with Rate-Monotonic Priorities

Karthik Lakshmanan; Ragunathan Rajkumar

Recent results have shown that the feasibility problem of scheduling periodic tasks with self-suspensions is NP-hard in the strong sense. We observe that a variation of the problem statement that includes sporadic tasks instead of periodic tasks results in a simple characterization of the critical scheduling instant. This in turn leads to an exact characterization of the critical instant for self-suspending tasks with respect to the interference (preemption) from higher-priority sporadic tasks. Using this characterization, we provide pseudo-polynomial response-time tests for analyzing the schedulability of such self-suspending tasks. Self-suspending tasks can also result in more worst-case interference to lower-priority tasks than their equivalent non-suspending counterparts with zero suspension intervals. Hence, we develop a dynamic slack enforcement scheme, which guarantees that the worst-case interference caused by suspending sporadic tasks is no more than the worst-case interference arising from equivalent non-suspending sporadic tasks without suspension intervals. The worst-case response time of self-suspending sporadic tasks themselves is also shown to be unaffected by dynamic slack enforcement, thereby making it optimal. In order to reduce the runtime complexity of slack enforcement, a static slack enforcement scheme is also developed. Empirical analysis of these schemes and the previously studied period enforcement algorithm shows that static slack enforcement achieves within 3% of the breakdown utilization of dynamic slack enforcement, while period enforcement achieves within 14% of dynamic slack enforcement. System designers can take advantage of these different execution control policies depending on their taskset utilizations and implementation constraints.
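
A minimal, suspension-oblivious response-time iteration gives a feel for this kind of analysis: the task's own suspension time is simply folded into its execution demand, which is a standard sufficient (but pessimistic) test rather than the exact critical-instant characterization derived in the paper. The task parameters below are illustrative.

```python
# Suspension-oblivious response-time iteration for a fixed-priority task (sufficient test only).
import math

def response_time(c, s, higher_prio, deadline):
    """c: WCET, s: total self-suspension time, higher_prio: list of (C_j, T_j) pairs."""
    r = c + s
    while True:
        demand = c + s + sum(math.ceil(r / t_j) * c_j for c_j, t_j in higher_prio)
        if demand > deadline:
            return None          # not schedulable according to this sufficient test
        if demand == r:
            return r             # fixed point reached: worst-case response-time bound
        r = demand

# Task with C = 2, total suspension 1, deadline 12, under two higher-priority tasks.
print(response_time(2.0, 1.0, higher_prio=[(1.0, 5.0), (2.0, 10.0)], deadline=12.0))
```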

Collaboration


Dive into Karthik Lakshmanan's collaborations.

Top Co-Authors

Junsung Kim, Carnegie Mellon University
Dionisio de Niz, Carnegie Mellon University
Anthony Rowe, Carnegie Mellon University
Raj Rajkumar, Carnegie Mellon University
Arvind Kandhalu, Carnegie Mellon University
Shinpei Kato, Carnegie Mellon University