Publication


Featured research published by William P. Smith.


International Conference on Data Engineering | 2017

NOUS: Construction and Querying of Dynamic Knowledge Graphs

Sutanay Choudhury; Khushbu Agarwal; Sumit Purohit; Baichuan Zhang; Meg Pirrung; William P. Smith; Mathew Thomas

The ability to construct domain-specific knowledge graphs (KGs) and perform question answering or hypothesis generation is a transformative capability. Despite their value, automated construction of knowledge graphs remains an expensive technical challenge that is beyond the reach of most enterprises and academic institutions. We propose an end-to-end framework for developing custom knowledge-graph-driven analytics for arbitrary application domains. The uniqueness of our system lies in A) its combination of curated KGs with knowledge extracted from unstructured text, B) its support for advanced trending and explanatory questions on a dynamic KG, and C) its ability to answer queries where the answer is embedded across multiple data sources.


IEEE International Conference on Semantic Computing | 2016

Effective Tooling for Linked Data Publishing in Scientific Research

Sumit Purohit; William P. Smith; Alan R. Chappell; Patrick West; Benno Lee; Eric G. Stephan; Peter Fox

Challenges that make it difficult to find, share, and combine published data, such as data heterogeneity and resource discovery, have led to increased adoption of semantic data standards and data publishing technologies. To make data more accessible, interconnected, and discoverable, some domains are being encouraged to publish their data as Linked Data. Consequently, this trend greatly increases the amount of data that semantic web tools are required to process, store, and interconnect. In attempting to process and manipulate large data sets, tools -- ranging from simple text editors to modern triplestores -- eventually break down upon reaching undefined thresholds. This paper shares our experiences in curating metadata, primarily to illustrate the challenges and resulting limitations that data publishers and consumers face in the current technological environment. This paper also provides a Linked Data based solution to the research problem of resource discovery, and offers a systematic approach that data publishers can take to select suitable tools to meet their data publishing needs. We present a real-world use case, the Resource Discovery for Extreme Scale Collaboration (RDESC), which features a scientific dataset (maximum size of 1.4 billion triples) used to evaluate a toolbox for data publishing in climate research. This paper also introduces a semantic data publishing software suite developed for the RDESC project.


Information Systems Frontiers | 2016

Semantic catalog of things, services, and data to support a wind data management facility

Eric G. Stephan; Todd O. Elsethagen; Larry K. Berg; Matthew C. Macduff; Patrick R. Paulson; Will Shaw; Chitra Sivaraman; William P. Smith; Adam Wynne

Transparency and data integrity are crucial to any scientific study wanting to garner impact and credibility in the scientific community. The purpose of this paper is to discuss how this can be achieved using what we define as the Semantic Catalog. The catalog exploits community vocabularies as well as linked open data best practices to seamlessly describe and link things, data, and off-the-shelf (OTS) services to support scientific offshore wind energy research for the U.S. Department of Energy’s Office of Energy Efficiency and Renewable Energy (EERE) Wind and Water Power Program. This is largely made possible by leveraging collaborative advances in the Internet of Things (IoT), Semantic Web, Linked Services, Linked Open Data (LOD), and Resource Description Framework (RDF) vocabulary communities, which provide the foundation for our design. By adapting these linked community best practices, we designed a wind characterization Data Management Facility (DMF) capable of continuous data collection, processing, and preservation of in situ and remote sensing instrument measurements. The design incorporates the aforementioned Semantic Catalog, which provides a transparent and ubiquitous interface for its user community to the things, data, and services of which the DMF is composed.


Annual ACIS International Conference on Computer and Information Science | 2015

Enhancing the impact of science data toward data discovery and reuse

Alan R. Chappell; Jesse Weaver; Sumit Purohit; William P. Smith; Karen L. Schuchardt; Patrick West; Benno Lee; Peter Fox

The amount of data produced in support of scientific research continues to grow rapidly. Despite the accumulation of and demand for scientific data, relatively little data are actually made available to the broader scientific community. We surmise that one root of this problem is the perceived difficulty of electronically publishing scientific data and associated metadata in a way that makes it discoverable. We propose exploiting Semantic Web technologies and best practices to make metadata both discoverable and easy to publish. We share experiences in curating metadata to illustrate the cumbersome nature of data reuse in the current research environment. We also make recommendations with a real-world example of how data publishers can provide their metadata by adding limited additional markup to HTML pages on the Web. With little additional effort from data publishers, the difficulty of data discovery, access, and sharing can be greatly reduced and the impact of research data greatly enhanced.


Archive | 2018

Text-Based Analytics for Biosurveillance

Lauren E. Charles; William P. Smith; Jeremiah Rounds; Joshua Mendoza

The ability to prevent, mitigate, or control a biological threat depends on how quickly the threat is identified and characterized. Ensuring the timely delivery of data and analytics is an essential aspect of providing adequate situational awareness in the face of a disease outbreak. This chapter outlines an analytic pipeline for supporting an advanced early warning system that can integrate multiple data sources and provide situational awareness of potential and occurring disease situations. The pipeline includes real-time automated data analysis founded on natural language processing, semantic concept matching, and machine learning techniques, to enrich content with metadata related to biosurveillance. Online news articles are presented as a use case for the pipeline, but the processes can be generalized to any textual data. In this chapter, the mechanics of a streaming pipeline are briefly discussed, as well as the major steps required to provide targeted situational awareness. The text-based analytic pipeline includes various processing steps, among them identifying article relevance to biosurveillance (e.g., a relevance algorithm) and extracting article features (who, what, where, why, how, and when).


IEEE Symposium Series on Computational Intelligence | 2016

Identification of program signatures from cloud computing system telemetry data

Nicole Nichols; Mark Greaves; William P. Smith; Ryan LaMothe; Gianluca Longoni; Jeremy R. Teuton

Malicious cloud computing activity can take many forms, including running unauthorized programs in a virtual environment. Detection of these malicious activities while preserving the privacy of the user is an important research challenge. Prior work has shown the potential viability of using cloud service billing metrics as a mechanism for proxy identification of malicious programs. Previously this novel detection method has been evaluated in a synthetic and isolated computational environment.


Collaboration Technologies and Systems | 2015

Deep web scientific sensor measurements usage: A standards-based approach

Eric G. Stephan; Alan R. Chappell; Chitra Sivaraman; Sumit Purohit; William P. Smith; Bernadette Farias Lóscio

There is a growing need for scientists producing sensor measurements from scientific studies to make these available to global consumer scientific communities. The authors believe that this can largely be achieved by relying on well-established international standards bodies such as the World Wide Web Consortium (w3.org), building momentum in scientific communities to establish best practices for data publication, and creating collaborative capabilities for consumers to explore, reuse, and contribute their knowledge about the data.


ICBO | 2015

Medical and Transmission Vector Vocabulary Alignment with Schema.org

William P. Smith; Alan R. Chappell; Courtney D. Corley


SR+SWIT@ISWC | 2016

Remembering the Important Things: Semantic Importance in Stream Reasoning

Rui Yan; Mark Greaves; William P. Smith; Deborah L. McGuinness


LDOW@WWW | 2016

Towards A Cache-Enabled, Order-Aware, Ontology-Based Stream Reasoning Framework.

Rui Yan; Brenda Praggastis; William P. Smith; Deborah L. McGuinness

Collaboration


Dive into William P. Smith's collaborations.

Top Co-Authors

Alan R. Chappell

Pacific Northwest National Laboratory

Sumit Purohit

Pacific Northwest National Laboratory

Deborah L. McGuinness

Rensselaer Polytechnic Institute

Eric G. Stephan

Pacific Northwest National Laboratory

Rui Yan

Rensselaer Polytechnic Institute

Benno Lee

Rensselaer Polytechnic Institute

Chitra Sivaraman

Pacific Northwest National Laboratory

Patrick West

Rensselaer Polytechnic Institute

Peter Fox

Rensselaer Polytechnic Institute
