Sam Henry
Virginia Commonwealth University
Publication
Featured research published by Sam Henry.
International Conference on Machine Learning and Applications | 2010
Theodoros Damoulas; Sam Henry; Andrew Farnsworth; Michael Lanzone; Carla P. Gomes
In this paper, we propose a probabilistic classification algorithm with a novel Dynamic Time Warping (DTW) kernel to automatically recognize the flight calls of different bird species. The performance of the method on a real-world dataset of warbler (Parulidae) flight calls is competitive with human expert recognition levels and outperforms other classifiers trained on a variety of feature extraction approaches. In addition, we offer a novel and intuitive DTW kernel formulation that, in contrast with previous work, is positive semi-definite. Finally, we obtain promising results on a larger dataset of multiple species, which we can handle efficiently due to the explicit multiclass probit likelihood of the proposed approach.
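For readers unfamiliar with DTW, the following is a minimal sketch of the textbook dynamic-programming DTW distance between two one-dimensional sequences. It is not the kernel proposed in the paper: the function names, the absolute-difference local cost, and the `gamma` parameter are assumptions made for illustration, and the naive exponential similarity shown here is in general not guaranteed to be positive semi-definite.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW distance between two 1-D numeric sequences.

    Assumption: the local cost is the absolute difference |a[i] - b[j]|.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def dtw_similarity(a, b, gamma=1.0):
    """Naive similarity exp(-gamma * DTW); illustrative only, not the paper's kernel."""
    return math.exp(-gamma * dtw_distance(a, b))
```

A sequence and a time-stretched copy of itself, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, align at zero cost, which is the property that makes DTW attractive for calls of varying duration.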
Journal of Biomedical Informatics | 2017
Sam Henry; Bridget T. McInnes
OBJECTIVES: This paper provides an introduction to and overview of literature-based discovery (LBD) in the biomedical domain. It introduces the reader to modern and historical LBD models, key system components, evaluation methodologies, and current trends. After completion, the reader will be familiar with the challenges and methodologies of LBD. The reader will be capable of distinguishing between recent LBD systems and publications, and of designing an LBD system for a specific application.
TARGET AUDIENCE: The intended audience ranges from biomedical researchers curious about LBD, to those looking to design an LBD system, to LBD experts trying to catch up on trends in the field. The reader need not be familiar with LBD, but knowledge of biomedical text processing tools is helpful.
SCOPE: This paper describes a unifying framework for LBD systems. Within this framework, different models and methods are presented both to distinguish between systems and to show where they overlap. Topics include term and document representation, system components, and an overview of models including co-occurrence models, semantic models, and distributional models. Other topics include uninformative term filtering, term ranking, results display, system evaluation, an overview of the application areas of drug development, drug repurposing, and adverse drug event prediction, and challenges and future directions. A timeline showing contributions to LBD and a table summarizing the works of several authors are provided. Topics are presented from a high-level perspective. References are given if more detailed analysis is required.
North American Chapter of the Association for Computational Linguistics | 2016
Sam Henry; Allison Sands
VRep is a system designed for SemEval 2016 Task 1 Semantic Textual Similarity (STS) and Task 2 Interpretable Semantic Textual Similarity (iSTS). STS quantifies the semantic equivalence between two snippets of text, and iSTS provides a reason why those snippets of text are similar. VRep makes extensive use of WordNet for both STS, where the Vector relatedness measure is used, and for iSTS, where features are extracted to create a learned rule-based classifier. This paper outlines the VRep algorithm, provides results from the 2016 SemEval competition, and analyzes the performance contributions of the system components.
Journal of Biomedical Informatics | 2018
Sam Henry; Clint Cuffy; Bridget T. McInnes
This paper presents a comparison of several methods for aggregating distributional context vectors into multi-word term representations, applied to the task of measuring semantic similarity and relatedness in the biomedical domain. We compare four multi-word term aggregation methods: summation of component word vectors, mean of component word vectors, direct construction of compound term vectors using the compoundify tool, and direct construction of concept vectors using the MetaMap tool. Dimensionality reduction is critical when constructing high-quality distributional context vectors, so these baseline co-occurrence vectors are compared against dimensionality-reduced vectors created using singular value decomposition (SVD) and against word2vec word embeddings trained with the continuous bag of words (CBOW) and skip-gram models. We also find optimal vector dimensionalities for the vectors produced by these techniques. Our results show that none of the tested multi-word term aggregation methods is statistically significantly better than any other. This allows flexibility when choosing a multi-word term aggregation method, and means that expensive corpus preprocessing may be avoided. Results are shown on several standard evaluation datasets, and state-of-the-art results are achieved.
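The two simplest aggregation methods named above, summation and mean of component word vectors, can be sketched as follows. This is an illustrative sketch only: the helper names and the three-dimensional toy vectors are hypothetical and are not taken from the paper, its data, or its tools.

```python
import math


def sum_vectors(vectors):
    """Element-wise sum of equal-length word vectors."""
    return [sum(column) for column in zip(*vectors)]


def mean_vectors(vectors):
    """Element-wise mean of equal-length word vectors."""
    count = len(vectors)
    return [total / count for total in sum_vectors(vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy vectors, invented purely for illustration.
toy = {
    "heart": [1.0, 0.0, 2.0],
    "attack": [0.0, 3.0, 1.0],
    "infarction": [1.0, 2.0, 2.0],
}

components = [toy["heart"], toy["attack"]]
summed = sum_vectors(components)    # vector for "heart attack" by summation
averaged = mean_vectors(components) # vector for "heart attack" by mean
```

Because the mean is the sum scaled by a positive constant, `cosine(summed, toy["infarction"])` and `cosine(averaged, toy["infarction"])` are equal up to floating-point error; under a cosine measure the two methods can only differ where vector magnitude enters the computation.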
BioNLP 2017 | 2017
Sam Henry; Clint Cuffy; Bridget T. McInnes
In this paper, we present an analysis of feature extraction methods via dimensionality reduction for the task of biomedical Word Sense Disambiguation (WSD). We modify the vector representations in the 2-MRD WSD algorithm, and evaluate four dimensionality reduction methods: Word Embeddings using Continuous Bag of Words and Skip Gram, Singular Value Decomposition (SVD), and Principal Component Analysis (PCA). We also evaluate the effects of vector size on the performance of each of these methods. Results are evaluated on five standard evaluation datasets (Abbrev.100, Abbrev.200, Abbrev.300, NLM-WSD, and MSH-WSD). We find that vector sizes of 100 are sufficient for all techniques except SVD, for which a vector size of 1500 is preferred. We also show that SVD performs on par with Word Embeddings for all but one dataset.
Proceedings of SPIE | 2015
Sam Henry
The ability to detect and identify camouflaged targets is critical in combat environments. Hyperspectral and multispectral cameras allow a soldier to identify threats more effectively than traditional RGB cameras, due to both increased color resolution and the ability to see beyond visible light. Static imagers have proven successful; however, the development of video-rate imagers allows for continuous real-time target identification and tracking. This paper presents an analysis of existing anomaly detection algorithms and how they can be adapted to video rates, and presents a general-purpose, semi-supervised, real-time anomaly detection algorithm using multiple frame sampling.
Proceedings of SPIE | 2015
Mehrube Mehrubeoglu; Michael Zemlan; Sam Henry
Food safety and quality in packaged products are paramount in the food processing industry. To ensure that packaged products are free of foreign materials, such as debris and pests, unwanted materials mixed with the targeted products must be detected before packaging. A portable hyperspectral imaging system in the visible-to-NIR range has been used to acquire hyperspectral data cubes from pinto beans that have been mixed with foreign matter. Bands and band ratios have been identified as effective features to develop a classification scheme for detection of foreign materials in pinto beans. A support vector machine has been implemented with a quadratic kernel to separate pinto beans and background (Class 1) from all other materials (Class 2) in each scene. After creating a binary classification map for the scene, further analysis of these binary images allows separation of false positives from true positives for proper removal action during packaging.
Archive | 2013
Henry M. Dante; Sam Henry; Seetharama C. Deevi
Archive | 2013
Seetharama C. Deevi; Henry M. Dante; Qiwei Liang; Sam Henry
Archive | 2013
Henry M. Dante; Sam Henry; Seetharama C. Deevi