Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Stephen F. Jenks is active.

Publication


Featured research published by Stephen F. Jenks.


Real-Time Systems | 2007

A middleware model supporting time-triggered message-triggered objects for standard Linux systems

Stephen F. Jenks; Kane Kim; Yuqing Li; Sheng Liu; Liangchen Zheng; Moon Hae Kim; Hee Yong Youn; Kyung Hee Lee; Dong-Myung Seol

The Time-triggered Message-triggered Object (TMO) programming and specification scheme came out of an effort to remove the limitations of conventional object structuring techniques in developing real-time (RT) distributed computing components and composing distributed computing applications out of such components and others. It is a natural and syntactically small but semantically powerful extension of object-oriented (OO) design and implementation techniques that allows the system designer to specify, in natural yet precise forms, the timing requirements imposed on data and function components of high-level distributed computing objects. TMO Support Middleware (TMOSM) was devised as an efficient middleware architecture that can be easily adapted to many commercial-off-the-shelf (COTS) hardware and kernel operating system platforms to form efficient TMO execution engines. However, until 2003, its adaptations were done for Microsoft Windows platforms only. As we have been developing and refining an adaptation of TMOSM to the Linux 2.6 operating system platform in recent years, TMOSM has been refined to possess further improved modularity and portability. This paper presents the refined TMOSM as well as the techniques developed for efficient adaptation of TMOSM to the Linux 2.6 platform.
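
The core TMO idea, methods that fire at declared instants rather than in response to messages, can be sketched with a toy scheduler. This is an invented illustration: the class and method names below are not the TMOSM API, and a real TMO engine enforces deadlines and runs spontaneous methods concurrently with message-triggered ones.

```python
import heapq

class ToyTMO:
    """Toy time-triggered object: registered methods fire at declared
    instants on a simulated clock (no real threads or deadline checks)."""
    def __init__(self):
        self._queue = []  # heap of (next_fire_time, period, name, fn)

    def register_spontaneous(self, fn, period, start=0):
        # "spontaneous" = time-triggered: fires by the clock, not by messages
        heapq.heappush(self._queue, (start, period, fn.__name__, fn))

    def run_until(self, horizon):
        fired = []
        while self._queue and self._queue[0][0] < horizon:
            t, period, name, fn = heapq.heappop(self._queue)
            fired.append((t, name))
            fn(t)
            heapq.heappush(self._queue, (t + period, period, name, fn))
        return fired

log = []
def sense(t):    # invented example method
    log.append(("sense", t))
def report(t):   # invented example method
    log.append(("report", t))

tmo = ToyTMO()
tmo.register_spontaneous(sense, period=10)
tmo.register_spontaneous(report, period=25)
events = tmo.run_until(50)
```

The point of the sketch is that each method's activation times are declared up front (period, start) and honored by the engine, rather than being a side effect of message arrival.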


Parallel Computing | 2011

Architectural support for thread communications in multi-core processors

Sevin Varoglu; Stephen F. Jenks

In the ongoing quest for greater computational power, efficiently exploiting parallelism is of paramount importance. Architectural trends have shifted from improving single-threaded application performance, often achieved through instruction level parallelism (ILP), to improving multithreaded application performance by supporting thread level parallelism (TLP). Thus, multi-core processors incorporating two or more cores on a single die have become ubiquitous. To achieve concurrent execution on multi-core processors, applications must be explicitly restructured to exploit parallelism, either by programmers or compilers. However, multithreaded parallel programming may introduce overhead due to communications among threads. Though some resources are shared among processor cores, current multi-core processors provide no explicit communications support for multithreaded applications that takes advantage of the proximity between cores. Currently, inter-core communications depend on cache coherence, resulting in demand-based cache line transfers with their inherent latency and overhead. In this paper, we explore two approaches to improve communications support for multithreaded applications. Prepushing is a software controlled data forwarding technique that sends data to the destination's cache before it is needed, eliminating cache misses in the destination's cache as well as reducing the coherence traffic on the bus. Software Controlled Eviction (SCE) improves thread communications by placing shared data in shared caches so that it can be found in a much closer location than remote caches or main memory. Simulation results show significant performance improvement with the addition of these architecture optimizations to multi-core processors.
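
A back-of-the-envelope model makes the prepushing argument concrete. The latency figures below are invented for illustration, not taken from the paper's simulations: if the producer forwards lines into the consumer's cache ahead of use, the consumer's demand misses become local hits.

```python
MISS_LATENCY = 100  # cycles: demand miss, line transferred from the remote cache
HIT_LATENCY = 3     # cycles: line already resident in the local cache

def consumer_cycles(lines, prepushed):
    """Cycles the consumer spends reading `lines`; prepushed lines hit locally."""
    return sum(HIT_LATENCY if ln in prepushed else MISS_LATENCY for ln in lines)

shared = list(range(64))                                  # lines the producer wrote
demand = consumer_cycles(shared, prepushed=set())         # baseline: every read misses
prepush = consumer_cycles(shared, prepushed=set(shared))  # producer forwarded all lines
```

Under these toy numbers the consumer's read cost drops by more than an order of magnitude; the paper's simulations quantify the real effect, including the reduced coherence traffic the model ignores.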


International Conference on Intelligent Sensors, Sensor Networks and Information | 2007

A Dynamic Cluster Formation Algorithm for Collaborative Information Processing in Wireless Sensor Networks

Chia-Yen Shih; Stephen F. Jenks

Clustering of sensor nodes has been shown to be an effective approach for distributed collaborative information processing in resource-constrained wireless sensor networks: it keeps network traffic local, reducing the energy dissipated by long-distance transmissions. Defining the range and topology of clusters so as to reduce energy consumption and the retransmissions caused by collisions on shared radio channels is an ongoing research topic. One solution is to minimize overlapping cluster ranges to reduce signal contention, thus reducing the energy dissipation of collisions and retransmissions. In this paper, we propose a dynamic cluster formation (DCF) algorithm that dynamically groups a set of sensor nodes into a logical cluster-based sensing and processing unit, the collaborative agent sensor team (CAST), to detect and track localized phenomena. Each cluster head is selected locally such that the total overlap between clusters in a CAST is low and the coverage of each cluster head is high. We compare our approach with optimal solutions, and simulation results show the effectiveness and scalability of our CAST DCF algorithm.
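
The head-selection idea, covering every node while keeping cluster overlap low, can be illustrated with a deliberately simplified greedy sketch on a 1-D line of node positions. This is not the paper's DCF algorithm (which handles 2-D deployments, local selection, and contention); it only shows why skipping already-covered nodes keeps overlap down.

```python
def greedy_heads(positions, R):
    """Pick cluster heads over 1-D node positions so every node lies
    within radius R of a head; a node only becomes a head if no earlier
    head already covers it, which minimizes cluster overlap."""
    heads = []
    for p in sorted(positions):
        if not heads or p - heads[-1] > R:
            heads.append(p)  # uncovered node: promote it to cluster head
    return heads

nodes = [0, 1, 2, 10, 11, 25]
heads = greedy_heads(nodes, R=3)
```

With radius 3, the three natural groups each get exactly one head and no two cluster ranges overlap.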


High Performance Computing Systems and Applications | 2002

An evaluation of thread migration for exploiting distributed array locality

Stephen F. Jenks; Jean-Luc Gaudiot

Thread migration is one approach to remote memory accesses on distributed memory parallel computers. In thread migration, threads of control migrate between processors to access data local to those processors, while conventional approaches tend to move data to the threads that need them. Migration approaches enhance spatial locality by making large address spaces local, but are less adept at exploiting temporal locality. Data-moving approaches, such as cached remote memory fetches or distributed shared memory, can use both types of locality. We present an experimental evaluation of thread migration's ability to reduce the impact of remote array accesses across distributed-memory computers. Nomadic Threads uses compiler-generated fine-grain threads which either migrate to make data local or fetch cache lines, tolerating latency with multithreading. We compare these alternatives using various array access patterns.
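
The trade-off the paper evaluates can be caricatured with a toy cost model (all costs invented for illustration): migration pays once per node visited, while a data-moving approach pays per remote item, so migration wins when spatial locality yields many accesses per node.

```python
MIGRATE_COST = 50  # move a thread's context to the node holding the data
FETCH_COST = 20    # fetch one remote data item (e.g., a cache line)

def migration_cost(accesses_per_node):
    # one migration per node visited; accesses are then local (cost ~0 here)
    return MIGRATE_COST * len(accesses_per_node)

def fetch_cost(accesses_per_node):
    # every access to a remote item pays the fetch latency
    return FETCH_COST * sum(accesses_per_node)

pattern = [8, 8, 8]  # the thread touches 8 items on each of 3 nodes
```

With one access per node the inequality flips and fetching wins, matching the observation that migration exploits spatial locality but is less adept at temporal locality.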


International Parallel and Distributed Processing Symposium | 2008

Architecture optimizations for synchronization and communication on chip multiprocessors

Sevin Fide; Stephen F. Jenks

Chip multiprocessors (CMPs) enable concurrent execution of multiple threads using several cores on a die. Current CMPs behave much like symmetric multiprocessors and do not take advantage of the proximity between cores to improve synchronization and communication between concurrent threads. Thread synchronization and communication instead use memory/cache interactions. We propose two architectural enhancements to support fine-grain synchronization and communication between threads that reduce overhead and memory/cache contention. Register-based synchronization exploits the proximity between cores to provide low-latency shared registers for synchronization. This approach can save significant power relative to spin waiting when blocking events that suspend the core are used. Pre-pushing provides software controlled data forwarding between caches to reduce coherence traffic and improve cache latency and hit rates. We explore the behavior of these approaches, and evaluate their effectiveness at improving synchronization and communication performance on CMPs with private caches. Our simulation results show significant reduction in inter-core traffic, latencies, and miss rates.
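
The register-based synchronization pattern, blocking until a peer signals instead of spin-waiting on a shared memory flag, resembles the following sketch. Python's threading.Event stands in purely as an analogy for the proposed low-latency shared register with core-suspending blocking; nothing here is the paper's hardware mechanism.

```python
import threading

ready = threading.Event()  # stand-in for the shared synchronization register
data = []

def producer():
    data.append(42)  # write the shared value
    ready.set()      # "write the synchronization register" to wake the consumer

def consumer(out):
    ready.wait()     # block (no spinning, no bus traffic) until signalled
    out.append(data[0])

out = []
t_consumer = threading.Thread(target=consumer, args=(out,))
t_producer = threading.Thread(target=producer)
t_consumer.start()
t_producer.start()
t_consumer.join()
t_producer.join()
```

The contrast the abstract draws is with a spin loop on a memory flag, which keeps the core active and generates coherence traffic on every poll.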


International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing | 2005

A Linux-based implementation of a middleware model supporting time-triggered message-triggered objects

Stephen F. Jenks; K. H. Kim; E. Henrich; Yuqing Li; Liangchen Zheng; Moon-Hong Kim; Kyung Hee Lee; Dong-Myung Seol; Hee Yong Youn

Programming and composing deterministic distributed real-time systems is becoming increasingly important, yet remains difficult and error-prone. An innovative approach to such systems is the general-form timeliness-guaranteed design paradigm, which is the basis for the time-triggered message-triggered object (TMO) programming and system specification scheme. This approach was originally developed for Windows programming environments and operating systems. This paper describes the techniques needed to support TMO on the Linux operating system and reports the resulting performance characteristics.


IEEE Computer Architecture Letters | 2008

Proactive Use of Shared L3 Caches to Enhance Cache Communications in Multi-Core Processors

Sevin Fide; Stephen F. Jenks

The software and hardware techniques to exploit the potential of multi-core processors are falling behind, even though the number of cores and cache levels per chip is increasing rapidly. There is no explicit communications support available, and hence inter-core communications depend on cache coherence protocols, resulting in demand-based cache line transfers with their inherent latency and overhead. In this paper, we present software controlled eviction (SCE) to improve the performance of multithreaded applications running on multi-core processors by moving shared data to shared cache levels before it is demanded from remote private caches. Simulation results show that SCE offers significant performance improvement (8-28%) and reduces L3 cache misses by 88-98%.
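
A toy three-level latency model shows why SCE helps (the figures are invented, not the paper's simulation parameters): lines the producer evicts into the shared L3 are found there by the consumer, instead of triggering a coherence transfer from the producer's remote private cache.

```python
L3_HIT = 30      # cycles: line found in the shared L3
REMOTE_L2 = 120  # cycles: coherence transfer from the other core's private cache

def consumer_cycles(lines, in_shared_l3):
    """Cycles the consumer spends; lines SCE placed in the shared L3 hit there."""
    return sum(L3_HIT if ln in in_shared_l3 else REMOTE_L2 for ln in lines)

shared_lines = set(range(32))
baseline = consumer_cycles(shared_lines, in_shared_l3=set())         # demand transfers
with_sce = consumer_cycles(shared_lines, in_shared_l3=shared_lines)  # SCE placed them
```

In this sketch SCE cuts the consumer's read cost by 75%; the reported 88-98% reduction applies to L3 misses in the paper's full simulation, not to this toy model.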


International Conference on Parallel Architectures and Compilation Techniques | 1997

Exploiting locality and tolerating remote memory access latency using thread migration

Stephen F. Jenks; Jean-Luc Gaudiot

Much research has focused on reducing and/or tolerating remote memory access latencies on distributed-memory parallel computers. Caching remote data is intended to reduce average access latency by handling as many remote memory accesses as possible using local copies of the data in the cache. Data-flow and multithreaded approaches help programs tolerate the latency of remote memory accesses by allowing processors to do other work while remote operations take place. The thread migration technique described here is a multithreaded architecture where threads migrate to remote processors that contain data they need. By exploiting access locality, the threads often use several data items from that processor before migrating to other processors for more data. Because the threads migrate in search of data, the approach is called Nomadic Threads. A prototype runtime system has been implemented on the CM-5 and is portable to other distributed memory parallel computers.


Software Technologies for Embedded and Ubiquitous Systems | 2009

HiperSense: An Integrated System for Dense Wireless Sensing and Massively Scalable Data Visualization

Pai H. Chou; Chong-Jing Chen; Stephen F. Jenks; Sung-jin Kim

HiperSense is a system for sensing and data visualization. Its sensing part comprises a heterogeneous wireless sensor network (WSN), enabled by infrastructure support for handoff and bridging. Handoff support enables simple, densely deployed, low-complexity, ultra-compact wireless sensor nodes operating at non-trivial data rates to achieve mobility by connecting to different gateways automatically. Bridging between multiple WSN standards is achieved by creating virtual identities on the gateways. The gateways collect data over Fast Ethernet for data post-processing and visualization. Data visualization is done on HIPerWall, a 200-megapixel display wall consisting of 5 rows by 10 columns of 30-inch displays. Such a powerful system is designed to minimize complexity on the sensor nodes while retaining high flexibility and high scalability.


Future Generation Computer Systems | 2006

A middleware architecture to facilitate distributed programming: DAROC, data-activated replicated object communications

Brian M. Stack; Gene Hsiao; Stephen F. Jenks

Programming distributed computer systems is difficult because of complexities in addressing remote entities, message handling, and program coupling. As systems grow, scalability becomes critical, as bottlenecks can serialize portions of the system. When these distributed system aspects are exposed to programmers, code size and complexity grow, as does the fragility of the system. This paper describes a distributed software architecture and middleware implementation that combines object-based blackboard-style communications with data-driven and periodic application scheduling to greatly simplify distributed programming while achieving scalable performance. Data-Activated Replicated Object Communications (DAROC) allows programmers to treat shared objects as local variables while providing implicit communications.
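
The programming model, shared objects written like local variables with data-activated scheduling of readers, can be sketched as an in-process toy. The Blackboard class below is invented for illustration; the real DAROC middleware replicates objects across machines and handles the communication implicitly.

```python
class Blackboard:
    """Toy in-process blackboard: writers update named shared objects,
    and readers registered on an object run when its data changes."""
    def __init__(self):
        self.objects = {}      # current value of each shared object
        self.subscribers = {}  # name -> callbacks to run on update

    def subscribe(self, name, callback):
        self.subscribers.setdefault(name, []).append(callback)

    def publish(self, name, value):
        self.objects[name] = value                 # local-variable-like write
        for cb in self.subscribers.get(name, []):
            cb(value)                              # data-activated scheduling

bb = Blackboard()
seen = []
bb.subscribe("track", seen.append)                 # "track" is an invented object name
bb.publish("track", {"id": 7, "pos": (3, 4)})
```

The writer never addresses the reader; coupling happens only through the named object, which is the decoupling the abstract credits for simpler, more scalable distributed programs.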

Collaboration


Dive into Stephen F. Jenks's collaborations.

Top Co-Authors

Daniel L. Wang (University of California)
Sevin Fide (University of California)
Sung-jin Kim (University of California)
K. H. Kim (University of California)
Brian M. Stack (University of California)
Pai H. Chou (University of California)