
Nathan O. Hodas

Pacific Northwest National Laboratory



Publication

Featured research published by Nathan O. Hodas.

Journal of Computational Chemistry | 2017

Deep learning for computational chemistry

Garrett B. Goh; Nathan O. Hodas; Abhinav Vishnu

The rise and fall of artificial neural networks is well documented in the scientific literature of both computer science and computational chemistry. Yet almost two decades later, we are now seeing a resurgence of interest in deep learning, a machine learning algorithm based on multilayer neural networks. Within the last few years, we have seen the transformative impact of deep learning in many domains, particularly in speech recognition and computer vision, to the extent that the majority of expert practitioners in those fields are now regularly eschewing prior established models in favor of deep learning models. In this review, we provide an introductory overview of the theory of deep neural networks and the unique properties that distinguish them from traditional machine learning algorithms used in cheminformatics. By providing an overview of the variety of emerging applications of deep neural networks, we highlight their ubiquity and broad applicability to a wide range of challenges in the field, including quantitative structure activity relationship, virtual screening, protein structure prediction, quantum chemistry, materials design, and property prediction. In reviewing the performance of deep neural networks, we observed a consistent outperformance against non‐neural network state‐of‐the‐art models across disparate research topics, and deep neural network‐based models often exceeded the “glass ceiling” expectations of their respective tasks. Coupled with the maturity of GPU‐accelerated computing for training deep neural networks and the exponential growth of chemical data on which to train these networks, we anticipate that deep learning algorithms will be a valuable tool for computational chemistry.

Meeting of the Association for Computational Linguistics | 2017

Separating Facts from Fiction: Linguistic Models to Classify Suspicious and Trusted News Posts on Twitter.

Svitlana Volkova; Kyle Shaffer; Jin Yea Jang; Nathan O. Hodas

Pew Research polls report 62 percent of U.S. adults get news on social media (Gottfried and Shearer, 2016). In a December poll, 64 percent of U.S. adults said that “made-up news” has caused a “great deal of confusion” about the facts of current events (Barthel et al., 2016). Fabricated stories in social media, ranging from deliberate propaganda to hoaxes and satire, contribute to this confusion in addition to having serious effects on global stability. In this work we build predictive models to classify 130 thousand news posts as suspicious or verified, and predict four sub-types of suspicious news – satire, hoaxes, clickbait and propaganda. We show that neural network models trained on tweet content and social network interactions outperform lexical models. Unlike previous work on deception detection, we find that adding syntax and grammar features to our models does not improve performance. Incorporating linguistic features improves classification results; however, social interaction features are most informative for finer-grained separation between the four types of suspicious news posts.

International World Wide Web Conferences | 2015

Disentangling the Lexicons of Disaster Response in Twitter

Nathan O. Hodas; Greg Ver Steeg; Joshua J. Harrison; Satish Chikkagoudar; Eric B. Bell; Courtney D. Corley

People around the world use social media platforms such as Twitter to express their opinions and share activities about various aspects of daily life. In the same way social media changes communication in daily life, it is also transforming the way individuals communicate during disasters and emergencies. Because emergency officials have come to rely on social media to communicate alerts and updates, they must learn how users communicate disaster-related content on social media. We used a novel information-theoretic unsupervised learning tool, CorEx, to extract and characterize highly relevant content used by the public on Twitter during known emergencies, such as fires, explosions, and hurricanes. Using the resulting analysis, authorities may be able to score social media content and prioritize their attention toward those messages most likely to be related to the disaster.

Meeting of the Association for Computational Linguistics | 2017

Intrinsic and Extrinsic Evaluation of Spatiotemporal Text Representations in Twitter Streams

Lawrence Phillips; Kyle Shaffer; Dustin Arendt; Nathan O. Hodas; Svitlana Volkova

Language in social media is a dynamic system, constantly evolving and adapting, with words and concepts rapidly emerging, disappearing, and changing their meaning. These changes can be estimated using word representations in context, over time and across locations. A number of methods have been proposed to track these spatiotemporal changes, but no general method exists to evaluate the quality of these representations. Previous work largely focused on qualitative evaluation, which we improve by proposing a set of visualizations that highlight changes in text representation over both space and time. We demonstrate the usefulness of novel spatiotemporal representations to explore and characterize specific aspects of the corpus of tweets collected from European countries over a two-week period centered around the terrorist attacks in Brussels in March 2016. In addition, we quantitatively evaluate spatiotemporal representations by feeding them into a downstream classification task – event type prediction. Thus, our work is the first to provide both intrinsic (qualitative) and extrinsic (quantitative) evaluation of text representations for spatiotemporal trends.

2016 IEEE Symposium on Technologies for Homeland Security (HST) | 2016

Adaptive visual sort and summary of micrographic images of nanoparticles for forensic analysis

Elizabeth Jurrus; Nathan O. Hodas; Nathan A. Baker; Tim Marrinan; Mark D. Hoover

Image classification of nanoparticles from scanning electron microscopes for nuclear forensic analysis is a long, time-consuming process. Months of analyst time may initially be required to sift through images in order to categorize morphological characteristics associated with nanoparticle identification. Subsequent assessment of newly acquired images against identified characteristics can be equally time-consuming. We present INStINCt, our Intelligent Signature Canvas, as a framework for quickly organizing image data in a web-based canvas framework that partitions images based on features derived from convolutional neural networks. This work is demonstrated using particle images from an aerosol study conducted by Pacific Northwest National Laboratory under the auspices of the U.S. Army Public Health Command to determine depleted uranium aerosol doses and risks.

Knowledge Discovery and Data Mining | 2018

Using Rule-Based Labels for Weak Supervised Learning: A ChemNet for Transferable Chemical Property Prediction

Garrett B. Goh; Charles Siegel; Abhinav Vishnu; Nathan O. Hodas

With access to large datasets, deep neural networks (DNN) have achieved human-level accuracy in image and speech recognition tasks. However, in chemistry, data is inherently small and fragmented. In this work, we develop an approach of using rule-based knowledge for training ChemNet, a transferable and generalizable deep neural network for chemical property prediction that learns in a weak-supervised manner from large unlabeled chemical databases. When coupled with transfer learning approaches to predict other smaller datasets for chemical properties that it was not originally trained on, we show that ChemNet's accuracy outperforms contemporary DNN models that were trained using conventional supervised learning. Furthermore, we demonstrate that the ChemNet pre-training approach is equally effective on both CNN (Chemception) and RNN (SMILES2vec) models, indicating that this approach is network architecture agnostic and is effective across multiple data modalities. Our results indicate that a pre-trained ChemNet that incorporates chemistry domain knowledge enables the development of generalizable neural networks for more accurate prediction of novel chemical properties.

Intelligent User Interfaces | 2018

SHARKZOR: Human in the Loop ML for User-Defined Image Classification

Meg Pirrung; Nathan Hilliard; Nancy O'Brien; Artëm Yankov; Court D. Corley; Nathan O. Hodas

Sharkzor is a web-based user interface for user-defined image classification. We present here a human-in-the-loop system with interactions focusing on 3 main user tasks. The user triages a number of images by organizing them into arbitrary groups with few examples. Sharkzor's sophisticated few-shot learning back end then approximates the user's mental model and automates organization of the entire dataset.

Computational Social Science | 2018

Model of cognitive dynamics predicts performance on standardized tests

Nathan O. Hodas; Jacob S. Hunter; Stephen J. Young; Kristina Lerman

In the modern knowledge economy, success demands sustained focus and high cognitive performance. Research suggests that human cognition is linked to a finite resource, and upon its depletion, cognitive functions such as self-control and decision-making may decline. While fatigue, among other factors, affects human activity, how cognitive performance evolves during extended periods of focus remains poorly understood. By analyzing performance of a large cohort answering practice standardized test questions online, we show that accuracy and learning decline as the test session progresses and recover following prolonged breaks. To explain these findings, we hypothesize that answering questions consumes some finite cognitive resources on which performance depends, but these resources recover during breaks between test questions. We propose a dynamic mechanism of the consumption and recovery of these resources and show that it explains empirical findings and predicts performance better than alternative hypotheses. While further controlled experiments are needed to identify the physiological origin of these phenomena, our work highlights the potential of empirical analysis of large-scale human behavior data to explore cognitive behavior.
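The consume-and-recover idea in this abstract can be illustrated with a toy simulation. The sketch below is not the authors' model: the exponential recovery form, the fixed fractional cost per question, and the parameter names `cost` and `tau` are all assumptions chosen only to show how a finite resource might deplete over a session and rebound after a long break.

```python
import math


def recover(resource, break_seconds, tau=600.0):
    """Relax the resource toward its full level (1.0) during a break.

    Assumed form: exponential recovery with a hypothetical time constant tau.
    """
    return 1.0 - (1.0 - resource) * math.exp(-break_seconds / tau)


def consume(resource, cost=0.05):
    """Answering one question uses up an assumed fixed fraction of what remains."""
    return resource * (1.0 - cost)


def simulate(breaks, cost=0.05, tau=600.0):
    """Return the resource level available at the start of each question.

    `breaks` holds the pause (in seconds) taken before each question.
    """
    resource = 1.0
    levels = []
    for pause in breaks:
        resource = recover(resource, pause, tau)
        levels.append(resource)
        resource = consume(resource, cost)
    return levels


if __name__ == "__main__":
    # Three quick questions, then a one-hour break, then one more question.
    for level in simulate([0, 10, 10, 10, 3600, 10]):
        print(f"{level:.3f}")
```

Under these assumptions the level falls across closely spaced questions and returns to nearly full after the long break, which mirrors the qualitative pattern the abstract describes; the numbers themselves carry no empirical meaning.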

arXiv: Machine Learning | 2017