Justin Almquist
Pacific Northwest National Laboratory
Publications
Featured research published by Justin Almquist.
Working IEEE/IFIP Conference on Software Architecture (WICSA) | 2008
Ian Gorton; Adam S. Wynne; Justin Almquist; Jack Chatterton
Building high performance analytical applications for data streams generated from sensors is a challenging software engineering problem. Such applications typically comprise a complex pipeline of processing components that capture, transform and analyze the incoming data stream. In addition, applications must provide high throughput, be scalable and easily modifiable so that new analytical components can be added with minimum effort. In this paper we describe the MeDICi integration framework (MIF), which is a middleware platform we have created to address these challenges. The MIF extends an open source messaging platform with a component-based API for integrating components into analytical pipelines. We describe the features and capabilities of the MIF, and show how it has been used to build a production analytical application for detecting cyber security attacks. The application was composed from multiple independently developed components using several different programming languages. The resulting application was able to process network sensor traffic in real time and provide insightful feedback to network analysts as soon as potential attacks were recognized.
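The component-pipeline pattern the abstract describes can be sketched in a few lines. The class and method names below are hypothetical stand-ins, not the real MeDICi Integration Framework API; they illustrate only the idea of chaining independently developed processing components into an analytical pipeline.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Minimal sketch of a component-based analytical pipeline.
// Pipeline and Component are illustrative names, not the MIF API.
public class PipelineSketch {
    /** A processing component: capture, transform, or analyze one message. */
    interface Component extends Function<String, String> {}

    /** Chains components so each stage feeds the next, as in a MIF pipeline. */
    static class Pipeline {
        private final List<Component> stages = new ArrayList<>();
        Pipeline add(Component c) { stages.add(c); return this; }
        String process(String msg) {
            for (Component c : stages) msg = c.apply(msg);
            return msg;
        }
    }

    public static void main(String[] args) {
        Pipeline p = new Pipeline()
            .add(m -> m.trim())                      // capture/normalize stage
            .add(m -> m.toUpperCase())               // transform stage
            .add(m -> m.contains("ALERT") ? m : ""); // analysis/filter stage
        System.out.println(p.process("  network alert: port scan  "));
    }
}
```

New analytical stages are added with a single `add` call, which mirrors the paper's goal of extending a pipeline with minimum effort.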
Enterprise Distributed Object Computing (EDOC) | 2003
Ian Gorton; Justin Almquist; Nick Cramer; Jereme N. Haack; Mark Hoza
Large-scale information processing environments must rapidly search through massive streams of raw data to locate useful information. These data streams contain textual and numeric data items, and may be highly structured or mostly freeform text. This project aims to create a high performance and scalable engine for locating relevant content in data streams. Based on the J2EE Java Messaging Service (JMS), the content-based messaging (CBM) engine provides highly efficient message formatting and filtering. This paper describes the design of the CBM engine, and presents empirical results that compare the performance with a standard JMS to demonstrate the performance improvements that are achieved.
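JMS message selectors filter on message properties with an SQL-like predicate (e.g. `priority > 5 AND type = 'cyber'`). As a rough sketch of what content-based filtering evaluates per message, the hand-built predicate below stands in for a compiled selector; a real CBM engine would parse and optimize the selector expression rather than hard-code it.

```java
import java.util.Map;
import java.util.function.Predicate;

// Sketch of content-based filtering over message properties, in the spirit
// of a JMS selector such as "priority > 5 AND type = 'cyber'". The predicate
// is hand-built here for clarity; an engine would compile the selector text.
public class CbmFilterSketch {
    public static void main(String[] args) {
        Predicate<Map<String, Object>> selector =
            m -> ((Integer) m.get("priority")) > 5
              && "cyber".equals(m.get("type"));

        Map<String, Object> hit  = Map.of("priority", 8, "type", "cyber");
        Map<String, Object> miss = Map.of("priority", 3, "type", "cyber");
        System.out.println(selector.test(hit));   // true: delivered to subscriber
        System.out.println(selector.test(miss));  // false: filtered out
    }
}
```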
IEEE Congress on Services | 2009
Jared M. Chase; Ian Gorton; Chandrika Sivaramakrishnan; Justin Almquist; Adam S. Wynne; George Chin; Terence Critchlow
Scientific applications are often structured as workflows that execute a series of interdependent, distributed software modules to analyze large data sets. The order of execution of the tasks in a workflow is commonly controlled by complex scripts, which over time become difficult to maintain and evolve. In this paper, we describe how we have integrated the Kepler scientific workflow platform with the MeDICi Integration Framework, which has been specifically designed to provide a standards-based, lightweight and flexible integration platform. The MeDICi technology provides a scalable, component-based architecture that efficiently handles integration with heterogeneous, distributed software systems. This paper describes the MeDICi Integration Framework and the mechanisms we used to integrate MeDICi components with Kepler workflow actors. We illustrate this solution with a workflow application for an atmospheric sciences application. The resulting solution promotes a strong separation of concerns, simplifying the Kepler workflow description and promoting the creation of a reusable collection of components available for other workflow applications in this domain.
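The integration mechanism amounts to an adapter: a MeDICi component is wrapped so the workflow engine can fire it as an actor. The interfaces below are invented for illustration and are much simpler than Kepler's actual actor API or the MIF component API, but they show the separation of concerns the paper promotes.

```java
// Sketch of the adapter pattern for exposing a MeDICi component as a
// workflow actor. MedICiComponent and Actor are hypothetical stand-ins
// for the MIF component API and Kepler's actor API.
public class ActorAdapterSketch {
    interface MedICiComponent { String process(String input); }
    interface Actor { String fire(String token); }

    /** Wraps a component so the workflow engine can fire it as an actor. */
    static Actor wrap(MedICiComponent c) {
        return token -> c.process(token);
    }

    public static void main(String[] args) {
        // e.g. a regridding step from an atmospheric-sciences workflow
        MedICiComponent regrid = data -> "regridded(" + data + ")";
        Actor actor = wrap(regrid);
        System.out.println(actor.fire("radar-scan-001"));
    }
}
```

Because the adapter owns all integration detail, the workflow description stays small and the wrapped components remain reusable outside any one workflow.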
Hawaii International Conference on System Sciences (HICSS) | 2005
Ian Gorton; Justin Almquist; Kevin E. Dorow; Peng Gong; Dave Thurman
Integrating multiple heterogeneous data sources into applications is a time-consuming, costly and error-prone engineering task. Relatively mature technologies exist that make integration tractable from an engineering perspective. These technologies, however, have many limitations, and hence present opportunities for breakthrough research. This paper briefly describes some of these limitations. It then provides an overview of the Data Concierge research project and prototype, which attempts to address some of them. The paper focuses on the core architecture and mechanisms in the Data Concierge for dynamically attaching to a previously unidentified source of information. The generic API supported by the Data Concierge is described, along with the architecture and prototype tools for describing the metadata necessary to facilitate dynamic integration. In addition, we describe the outstanding challenges that remain to be solved before the Data Concierge concept can be realized.
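The core idea, a uniform interface plus runtime attachment, can be sketched briefly. The `DataSource` and `Registry` names below are illustrative and are not the Data Concierge's actual generic API; in the real system, attachment is driven by metadata descriptions rather than a simple name key.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of a generic data-source API in the spirit of Data Concierge:
// sources are attached dynamically at runtime instead of being compiled in.
// All names here are illustrative, not the project's real API.
public class DataConciergeSketch {
    /** Uniform interface every attached source must expose. */
    interface DataSource {
        List<String> query(String term);
    }

    /** Registry that "attaches" previously unknown sources at runtime. */
    static class Registry {
        private final Map<String, DataSource> sources = new HashMap<>();
        void attach(String name, DataSource s) { sources.put(name, s); }
        List<String> query(String name, String term) {
            return sources.get(name).query(term);
        }
    }

    public static void main(String[] args) {
        Registry r = new Registry();
        // A source unknown at build time, attached dynamically.
        r.attach("csvArchive", term -> List.of(term + "-row1", term + "-row2"));
        System.out.println(r.query("csvArchive", "sensor"));
    }
}
```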
Component-Based Software Engineering (CBSE) | 2009
Ian Gorton; Jared M. Chase; Adam S. Wynne; Justin Almquist; Alan R. Chappell
Scientific applications are often structured as workflows that execute a series of distributed software modules to analyze large data sets. Such workflows are typically constructed using general-purpose scripting languages to coordinate the execution of the various modules and to exchange data sets between them. While such scripts provide a cost-effective approach for simple workflows, as the workflow structure becomes complex and evolves, the scripts quickly become complex and difficult to modify. This makes them a major barrier to easily and quickly deploying new algorithms and exploiting new, scalable hardware platforms. In this paper, we describe the MeDICi Workflow technology that is specifically designed to reduce the complexity of workflow application development, and to efficiently handle data intensive workflow applications. MeDICi integrates standard component-based and service-based technologies, and employs an efficient integration mechanism to ensure large data sets can be efficiently processed. We illustrate the use of MeDICi with a climate data processing example that we have built, and describe some of the new features we are creating to further enhance MeDICi Workflow applications.
Information Reuse and Integration (IRI) | 2010
Arzu Gosney; Christopher S. Oehmen; Adam S. Wynne; Justin Almquist
Large computing systems including clusters, clouds, and grids, provide high-performance capabilities that can be utilized for scientific applications. As the ubiquity of these systems increases and the scope of analysis performed on them expand, there is a growing need for applications that do not require users to learn the details of high-performance computing, and are flexible and adaptive to accommodate the best time-to-solution. In this paper we introduce a new adaptive capability for the MeDICi middleware and describe the applicability of this design to a scientific workflow application for biology. This adaptive framework provides a programming model for implementing a workflow using high-performance systems and enables the compute capabilities at one site to automatically analyze data being generated at another site. This adaptive design improves overall time-to-solution by moving the data analysis task to the most appropriate resource dynamically, automatically reacting to failures and load fluctuations.
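The adaptive behavior described, routing each analysis task to the site with the best expected time-to-solution while skipping failed sites, can be sketched as a simple scoring rule. The `Site` record and the least-loaded heuristic below are invented for illustration; the paper's framework would draw on richer load and transfer-cost information.

```java
import java.util.Comparator;
import java.util.List;

// Sketch of adaptive task placement: send the analysis to whichever site
// currently offers the best time-to-solution, skipping sites that are down.
// The Site fields and least-loaded scoring rule are illustrative only.
public class AdaptiveDispatchSketch {
    record Site(String name, boolean up, double load) {}

    /** Picks the least-loaded available site; failures are skipped automatically. */
    static String dispatch(List<Site> sites) {
        return sites.stream()
            .filter(Site::up)
            .min(Comparator.comparingDouble(Site::load))
            .map(Site::name)
            .orElseThrow(() -> new IllegalStateException("no site available"));
    }

    public static void main(String[] args) {
        List<Site> sites = List.of(
            new Site("clusterA", true,  0.9),
            new Site("cloudB",   false, 0.1),  // down: skipped despite low load
            new Site("gridC",    true,  0.4));
        System.out.println(dispatch(sites));   // least-loaded live site
    }
}
```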
2007 Sixth International IEEE Conference on Commercial-off-the-Shelf (COTS)-Based Software Systems (ICCBSS'07) | 2007
David A. Thurman; Justin Almquist; Ian Gorton; Adam S. Wynne; Jack Chatterton
Architectures and technologies for enterprise application integration are relatively mature, resulting in a range of standards-based and proprietary COTS middleware technologies. However, in the domain of complex analytical applications, integration architectures are not so well understood. Analytical applications such as those used in scientific discovery and financial and intelligence analysis exert unique demands on their underlying architectures. These demands make existing COTS integration middleware less suitable for use in enterprise analytics environments. In this paper we describe SIFT (Scalable Information Fusion and Triage), an application architecture designed for integrating the various components that comprise enterprise analytics applications. SIFT exploits a common pattern for composing analytical components, and extends an existing messaging platform with dynamic configuration mechanisms and scaling capabilities. We demonstrate the use of SIFT to create a decision support platform for quality control based on large volumes of incoming delivery data. The strengths and weaknesses of the SIFT solution are discussed, and we conclude by describing where further work is required to create a complete solution applicable to a wide range of analytical application domains.
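One way to picture the scaling capability is replica-based: when a stage becomes a bottleneck, more copies of that component are attached at runtime and incoming messages are spread across them. The class below is a toy round-robin dispatcher invented for illustration, not SIFT's actual mechanism, which sits on a messaging platform.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of replica-based scaling for one analytical stage: replicas can be
// added while the system runs, and messages are spread across them.
// Round-robin dispatch here is illustrative, not SIFT's real mechanism.
public class SiftScalingSketch {
    static class ScalableStage {
        private final List<String> replicas = new ArrayList<>();
        private final AtomicInteger next = new AtomicInteger();

        void addReplica(String name) { replicas.add(name); }  // dynamic reconfiguration

        String route(String msg) {                            // spread load across replicas
            String r = replicas.get(next.getAndIncrement() % replicas.size());
            return r + " handles " + msg;
        }
    }

    public static void main(String[] args) {
        ScalableStage triage = new ScalableStage();
        triage.addReplica("triage-1");
        triage.addReplica("triage-2");  // scaled out at runtime
        System.out.println(triage.route("delivery-record-1"));
        System.out.println(triage.route("delivery-record-2"));
    }
}
```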
International Conference on Software Engineering (ICSE) | 2004
Justin Almquist; Ian Gorton; Jereme N. Haack
Large-scale information processing applications must rapidly search through high volume streams of structured and unstructured textual data to locate useful information. Content-based messaging systems (CBMSs) provide a powerful technology platform for building such stream handling systems. CBMSs make it possible to efficiently execute queries on messages in streams to extract those that contain content of interest. In this paper, we describe efforts to augment an experimental CBMS with the ability to perform efficient free-text search operations. The design of the CBMS platform, based upon a Java Messaging Service, is described, and an empirical evaluation is presented to demonstrate the performance implications of a range of queries varying in complexity.
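Combining a structured property selector with a free-text match, as this abstract describes, composes naturally as two predicates. The query below is an invented example of such a combined filter; a real CBMS would index message bodies rather than scan each one per query.

```java
import java.util.Map;
import java.util.function.Predicate;

// Sketch of augmenting property-based selection with a free-text search
// predicate, as the CBMS work describes. The query is illustrative; a real
// engine would index the body instead of scanning it per message.
public class FreeTextCbmsSketch {
    record Message(Map<String, Object> props, String body) {}

    public static void main(String[] args) {
        Predicate<Message> structured = m -> "en".equals(m.props().get("lang"));
        Predicate<Message> freeText   = m -> m.body().toLowerCase().contains("intrusion");
        Predicate<Message> query = structured.and(freeText); // selector + text search

        Message m1 = new Message(Map.of("lang", "en"), "possible Intrusion detected on subnet");
        Message m2 = new Message(Map.of("lang", "en"), "routine status report");
        System.out.println(query.test(m1));  // true: both predicates match
        System.out.println(query.test(m2));  // false: free-text part fails
    }
}
```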
Archive | 2009
Ian Gorton; Adam S. Wynne; Justin Almquist; Jack Chatterton; Jared M. Chase; Alan R. Chappell
Archive | 2012
Christopher S. Oehmen; Scott T. Dowson; Wes Hatley; Justin Almquist; Bobbie-Jo M. Webb-Robertson; Jason McDermott; Ian Gorton; Lee Ann McCue; Deborah K. Gracio