Alan R. Chappell

Pacific Northwest National Laboratory

Publication

Featured research published by Alan R. Chappell.

meeting of the association for computational linguistics | 2007

PNNL: A Supervised Maximum Entropy Approach to Word Sense Disambiguation

Stephen C. Tratz; Antonio Sanfilippo; Michelle L. Gregory; Alan R. Chappell; Christian Posse; Paul D. Whitney

In this paper, we describe the PNNL Word Sense Disambiguation system as applied to the English all-words task in SemEval 2007. We use a supervised learning approach, employing a large number of features and using Information Gain for dimension reduction. The rich feature set combined with a Maximum Entropy classifier produces results that are significantly better than baseline and are the highest F-score for the fine-grained English all-words subtask of SemEval.
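The abstract names Information Gain as the dimension-reduction step but gives no details. The following is a minimal sketch of the textbook definition (the drop in label entropy after splitting on a feature), not the PNNL system's implementation. The function names, the dictionary-based feature rows, and the top-`k` cutoff are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Label entropy minus the weighted entropy after grouping by feature value."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_features(rows, labels, k):
    """Hypothetical helper: keep the k feature names with the highest gain.

    rows is a list of dicts mapping feature name -> categorical value.
    Ties are broken alphabetically so the result is deterministic.
    """
    names = sorted({name for row in rows for name in row})
    scored = [(information_gain([row.get(n) for row in rows], labels), n) for n in names]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]
```

In a pipeline of this kind, the selected feature names would then restrict the input passed to whatever classifier is used downstream; the classifier itself is outside the scope of this sketch.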

component based software engineering | 2009

Services + Components = Data Intensive Scientific Workflow Applications with MeDICi

Ian Gorton; Jared M. Chase; Adam S. Wynne; Justin Almquist; Alan R. Chappell

Scientific applications are often structured as workflows that execute a series of distributed software modules to analyze large data sets. Such workflows are typically constructed using general-purpose scripting languages to coordinate the execution of the various modules and to exchange data sets between them. While such scripts provide a cost-effective approach for simple workflows, as the workflow structure becomes complex and evolves, the scripts quickly become complex and difficult to modify. This makes them a major barrier to easily and quickly deploying new algorithms and exploiting new, scalable hardware platforms. In this paper, we describe the MeDICi Workflow technology that is specifically designed to reduce the complexity of workflow application development, and to efficiently handle data intensive workflow applications. MeDICi integrates standard component-based and service-based technologies, and employs an efficient integration mechanism to ensure large data sets can be efficiently processed. We illustrate the use of MeDICi with a climate data processing example that we have built, and describe some of the new features we are creating to further enhance MeDICi Workflow applications.

ieee symposium on large data analysis and visualization | 2015

A visual analytics paradigm enabling trillion-edge graph exploration

Pak Chung Wong; David J. Haglin; David S. Gillen; Daniel Chavarria; Vito Giovanni Castellana; Cliff Joslyn; Alan R. Chappell; Song Zhang

We present a visual analytics paradigm and a system prototype for exploring Web-scale graphs. A Web-scale graph is described as a graph with ~one trillion edges and ~50 billion vertices. While there is an aggressive R&D effort in processing and exploring Web-scale graphs among Internet vendors such as Facebook and Google, visualizing a graph of that scale remains an underexplored R&D area. The paper describes a nontraditional peek-and-filter strategy that facilitates the exploration of a graph database of unprecedented size for visualization and analytics. We demonstrate that our system prototype can (1) preprocess a graph with ~25 billion edges in less than two hours and (2) support database query and interactive visualization on the processed graph database afterward. Based on our computational performance results, we argue that we most likely will achieve the one trillion edge mark (a computational performance improvement of 40 times) for graph visual analytics in the near future.

web intelligence | 2011

Structure Discovery in Large Semantic Graphs Using Extant Ontological Scaling and Descriptive Semantics

Sinan Al-Saffar; Cliff Joslyn; Alan R. Chappell

As semantic datasets grow to be very large and divergent, there is a need to identify and exploit their inherent semantic structure for discovery and optimization. Towards that end, we present here a novel methodology to identify the semantic structures inherent in an arbitrary semantic graph dataset. We first present the concept of an extant ontology as a statistical description of the semantic relations present amongst the typed entities modeled in the graph. This serves as a model of the underlying semantic structure to aid in discovery and visualization. We then describe a method of ontological scaling in which the ontology is employed as a hierarchical scaling filter to infer different resolution levels at which the graph structures are to be viewed or analyzed. We illustrate these methods on three large and publicly available semantic datasets containing more than one billion edges each.
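One plausible reading of an "extant ontology" is a tally of how often each relation links entities of each pair of types. The sketch below follows that reading only; it is an assumption, not the authors' method. The `(subject, predicate, object)` tuple format, the `rdf:type` marker string, and the `(untyped)` placeholder are hypothetical choices made for this example.

```python
from collections import Counter

RDF_TYPE = "rdf:type"  # assumed marker for type-assertion triples


def extant_ontology(triples):
    """Count (subject type, predicate, object type) patterns in a triple list."""
    # First pass: collect the declared types of every entity.
    types = {}
    for subject, predicate, obj in triples:
        if predicate == RDF_TYPE:
            types.setdefault(subject, set()).add(obj)

    # Second pass: tally each non-type relation by the types of its endpoints.
    counts = Counter()
    for subject, predicate, obj in triples:
        if predicate == RDF_TYPE:
            continue
        for subject_type in types.get(subject, {"(untyped)"}):
            for object_type in types.get(obj, {"(untyped)"}):
                counts[(subject_type, predicate, object_type)] += 1
    return counts
```

The resulting counts form a small typed graph that summarizes the much larger instance graph, which is the kind of structure a scaling filter could then coarsen or refine.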

ieee international conference semantic computing | 2016

Effective Tooling for Linked Data Publishing in Scientific Research

Sumit Purohit; William P. Smith; Alan R. Chappell; Patrick West; Benno Lee; Eric G. Stephan; Peter Fox

Challenges that make it difficult to find, share, and combine published data, such as data heterogeneity and resource discovery, have led to increased adoption of semantic data standards and data publishing technologies. To make data more accessible, interconnected and discoverable, some domains are being encouraged to publish their data as Linked Data. Consequently, this trend greatly increases the amount of data that semantic web tools are required to process, store, and interconnect. In attempting to process and manipulate large data sets, tools -- ranging from simple text editors to modern triplestores -- eventually break down upon reaching undefined thresholds. This paper shares our experiences in curating metadata, primarily to illustrate the challenges and resulting limitations that data publishers and consumers face in the current technological environment. This paper also provides a Linked Data based solution to the research problem of resource discovery, and offers a systematic approach that data publishers can take to select suitable tools to meet their data publishing needs. We present a real-world use case, the Resource Discovery for Extreme Scale Collaboration (RDESC), which features a scientific dataset (maximum size of 1.4 billion triples) used to evaluate a toolbox for data publishing in climate research. This paper also introduces a semantic data publishing software suite developed for the RDESC project.

annual acis international conference on computer and information science | 2015

Enhancing the impact of science data toward data discovery and reuse

Alan R. Chappell; Jesse Weaver; Sumit Purohit; William P. Smith; Karen L. Schuchardt; Patrick West; Benno Lee; Peter Fox

The amount of data produced in support of scientific research continues to grow rapidly. Despite the accumulation of and demand for scientific data, relatively little data are actually made available to the broader scientific community. We surmise that one root of this problem is the perceived difficulty of electronically publishing scientific data and associated metadata in a way that makes it discoverable. We propose exploiting Semantic Web technologies and best practices to make metadata both discoverable and easy to publish. We share experiences in curating metadata to illustrate the cumbersome nature of data reuse in the current research environment. We also make recommendations with a real-world example of how data publishers can provide their metadata by adding limited additional markup to HTML pages on the Web. With little additional effort from data publishers, the difficulty of data discovery, access, and sharing can be greatly reduced and the impact of research data greatly enhanced.

bioinformatics and biomedicine | 2012

Annotating the structure and components of a nanoparticle formulation using computable string expressions

Dennis G. Thomas; Satish Chikkagoudar; Alan R. Chappell; Nathan A. Baker

Nanoparticle formulations that are being developed and tested for various medical applications are typically multi-component systems that vary in their structure, chemical composition, and function. It is difficult to compare and understand the differences between the structural and chemical descriptions of hundreds and thousands of nanoparticle formulations found in text documents. We have developed a string nomenclature to create computable string expressions that identify and enumerate the different high-level types of material parts of a nanoparticle formulation and represent the spatial order of their connectivity to each other. The string expressions are intended to be used as IDs, along with terms that describe a nanoparticle formulation and its material parts, in data sharing documents and nanomaterial research databases. The strings can be parsed and represented as a directed acyclic graph. The nodes of the graph can be used to display the string ID, name and other text descriptions of the nanoparticle formulation or its material part, while the edges represent the connectivity between the material parts with respect to the whole nanoparticle formulation. The different patterns in the string expressions can be searched for and used to compare the structure and chemical components of different nanoparticle formulations. The proposed string nomenclature is extensible and can be applied along with ontology terms to annotate the complete description of nanoparticle formulations.

2012 3rd International Workshop on Cognitive Information Processing (CIP) | 2012

Pattern discovery using semantic network analysis

Robin Burk; Alan R. Chappell; Michelle L. Gregory; Cliff Joslyn; Liam R. McGrath

Cognitive information processing at higher conceptual levels requires a computational approach to knowledge representation and analysis. Semantic network analysis bridges the gap between probabilistic pattern recognition techniques and symbolic representations by replacing cumbersome and computationally complex forms of logic-based semantic inference common in symbolic approaches with mathematical metrics on graph representations of labelled, directed semantic networked data. These metrics in turn support assessment of evidentiary support for the presence of patterns of interest in which entities play specified roles in complex event scenarios. The resulting system allows patterns to be specified at higher levels of conceptual abstraction while also remaining robust to conflicting and incomplete information.

2011 IEEE Network Science Workshop | 2011

Semantic network analysis for evidence evaluation: The threat anticipation initiative

Robin Burk; Mark Davis; Michele Morara; Steve Rust; Alan R. Chappell; Michelle L. Gregory; Liam R. McGrath; Cliff Joslyn

Semantic network analysis offers a computational method for discovery, pattern matching, and reasoning with large amounts of unstructured, semi-structured and structured information. The Threat Anticipation Platform replaces more cumbersome and computationally complex forms of semantic inference with metrics on graph representations of labeled, directed semantic networked data to identify the degree of evidence within multiple data sources for specified hypotheses about potential events.

Archive | 2017

Studying Military Community Health, Well-Being, and Discourse Through the Social Media Lens

Umashanthi Pavalanathan; Vivek V. Datla; Svitlana Volkova; Lauren Charles-Smith; Meg Pirrung; Josh Harrison; Alan R. Chappell; Courtney D. Corley

Social media can provide a resource for characterizing communities and targeted populations through activities and content shared online. For instance, studying the armed forces’ use of social media may provide insights into their health and well-being. In this paper, we address three broad research questions: (1) How do military populations use social media? (2) What topics do military users discuss in social media? (3) Do military users talk about health and well-being differently than civilians? Military Twitter users were identified through keywords in the profile description of users who posted geo-tagged tweets at military installations. These military tweets were compared with the tweets from the remaining population. Our analysis indicates that military users talk more about military-related responsibilities and events, whereas nonmilitary users talk more about school, work, and leisure activities. A significant difference in the online content generated by the two populations was identified, involving sentiment, health, language, and social media features.
