Robert M. Patton
Oak Ridge National Laboratory
Publications
Featured research published by Robert M. Patton.
IEEE International Conference on High Performance Computing, Data, and Analytics | 2015
Steven R. Young; Derek C. Rose; Thomas P. Karnowski; Seung-Hwan Lim; Robert M. Patton
There has been a recent surge of success in utilizing Deep Learning (DL) in imaging and speech applications, owing to its relatively automatic feature generation and, particularly for convolutional neural networks (CNNs), its high-accuracy classification abilities. While these models learn their parameters through data-driven methods, model selection (i.e., architecture construction) through hyper-parameter choices remains a tedious and highly intuition-driven task. To address this, Multi-node Evolutionary Neural Networks for Deep Learning (MENNDL) is proposed as a method for automating network selection on computational clusters through hyper-parameter optimization performed via genetic algorithms.
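The abstract does not include implementation details, but the core loop of GA-based hyper-parameter search is easy to sketch. The toy Python example below assumes a hypothetical search space and a placeholder fitness function; MENNDL itself evaluates candidate networks by training them in parallel across cluster nodes, which is elided here.

```python
import random

# Hypothetical search space: each gene is one CNN hyper-parameter.
# MENNDL's actual encoding and fitness (validation accuracy of the
# trained network) are not given in the abstract; these are stand-ins.
SPACE = {
    "filters":     [16, 32, 64, 128],
    "kernel_size": [3, 5, 7],
    "layers":      [2, 3, 4, 5],
    "lr":          [1e-2, 1e-3, 1e-4],
}

def random_individual():
    return {k: random.choice(v) for k, v in SPACE.items()}

def fitness(ind):
    # Placeholder: in MENNDL this would train the encoded CNN and
    # return validation accuracy, evaluated in parallel across nodes.
    return -abs(ind["layers"] - 4) - abs(ind["filters"] - 64) / 64

def crossover(a, b):
    return {k: random.choice([a[k], b[k]]) for k in SPACE}

def mutate(ind, rate=0.2):
    child = dict(ind)
    for k in SPACE:
        if random.random() < rate:
            child[k] = random.choice(SPACE[k])
    return child

def evolve(pop_size=20, generations=10):
    pop = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # truncation selection
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=fitness)

print(evolve())
```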
Undergraduate Research Journal | 2008
J S Charles; Robert M. Patton; Thomas E. Potok; Xiaohui Cui
Analyzing and grouping documents by content is a complex problem. One explored method of solving this problem borrows from nature, imitating the flocking behavior of birds. Each bird represents a single document and flies toward other documents that are similar to it. One limitation of this method of document clustering is its O(n²) complexity. As the number of documents grows, it becomes increasingly difficult to receive results in a reasonable amount of time. However, flocking behavior, along with many naturally inspired algorithms such as ant colony optimization and particle swarm optimization, is highly parallel, and such algorithms have found increased performance on expensive cluster computers. In the last few years, the graphics processing unit (GPU) has received attention for its ability to solve highly parallel and semi-parallel problems much faster than the traditional sequential processor. Some applications see a huge increase in performance on this new platform, and the cost of these high-performance devices is marginal when compared with the price of cluster machines. In this paper, we have conducted research to exploit this architecture and apply its strengths to the document flocking problem. Our results highlight the potential benefit the GPU brings to many naturally inspired algorithms. Using the CUDA platform from NVIDIA®, we developed a document flocking implementation to be run on the NVIDIA® GeForce 8800. Additionally, we developed a similar but sequential implementation of the same algorithm to be run on a desktop CPU. We tested the performance of each on groups of news articles ranging in size from 200 to 3000 documents. The results of these tests were significant: performance gains ranged from three to nearly five times improvement of the GPU over the CPU implementation. Our results also confirm that the two implementations are of similar algorithmic complexity, indicating that the gains come from the hardware rather than from algorithmic differences. This improvement in runtime makes the GPU a potentially powerful new platform for document analysis.
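As a rough illustration of the O(n²) update the paper parallelizes, here is a minimal single-threaded sketch in Python/NumPy. The similarity matrix is random stand-in data and the attraction/repulsion rules are simplified; the paper's CUDA kernel and actual document-similarity measure are not reproduced.

```python
import numpy as np

# One O(n^2) document-flocking step. Each document is a "bird" with a
# 2-D position and velocity; similar documents attract, dissimilar ones
# repel. A CUDA implementation parallelizes this same pairwise loop
# across GPU threads.
def flocking_step(pos, vel, similarity, radius=5.0, dt=0.1,
                  attract=0.05, repel=0.05, sim_threshold=0.5):
    n = len(pos)
    for i in range(n):
        force = np.zeros(2)
        for j in range(n):                      # O(n^2) neighbor scan
            if i == j:
                continue
            offset = pos[j] - pos[i]
            dist = np.linalg.norm(offset)
            if dist == 0 or dist > radius:
                continue
            if similarity[i, j] >= sim_threshold:
                force += attract * offset       # cohere with similar docs
            else:
                force -= repel * offset / dist  # separate from dissimilar
        vel[i] += force
    pos += vel * dt
    return pos, vel

rng = np.random.default_rng(0)
n_docs = 200
pos = rng.uniform(0, 50, (n_docs, 2))
vel = rng.uniform(-1, 1, (n_docs, 2))
sim = rng.uniform(0, 1, (n_docs, n_docs))
sim = (sim + sim.T) / 2                          # symmetric toy similarities
pos, vel = flocking_step(pos, vel, sim)
```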
Genetic and Evolutionary Computation Conference | 2009
Robert M. Patton; Thomas E. Potok; Barbara G. Beckerman; Jim N. Treadwell
Radiologists disagree with each other over the characteristics and features of what constitutes a normal mammogram and the terminology to use in the associated radiology report. Recently, the focus has been on classifying abnormal or suspicious reports, but even this process needs further layers of clustering and gradation, so that individual lesions can be more effectively classified. Using a genetic algorithm, the approach described here successfully learns phrase patterns for two distinct classes of radiology reports (normal and abnormal). These patterns can then be used as a basis for automatically analyzing, categorizing, clustering, or retrieving relevant radiology reports for the user.
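A minimal sketch of the general idea follows, assuming a toy corpus and a simple word-set pattern representation (the paper's actual phrase encoding, operators, and fitness function are not given in the abstract): candidate patterns are scored by how well they match abnormal reports while missing normal ones.

```python
import random

# Toy sketch of evolving phrase patterns that separate two report
# classes. Reports, vocabulary, and fitness weighting are illustrative.
normal = ["no evidence of malignancy", "benign findings stable"]
abnormal = ["suspicious mass noted", "irregular density requires biopsy"]
vocab = sorted({w for r in normal + abnormal for w in r.split()})

def fitness(pattern):
    # Reward patterns that match abnormal reports and miss normal ones.
    hits = sum(any(w in r for w in pattern) for r in abnormal)
    misses = sum(any(w in r for w in pattern) for r in normal)
    return hits - misses

def evolve(pop_size=30, generations=40, pattern_len=3):
    pop = [random.sample(vocab, pattern_len) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        pop = survivors + [
            # Mutation: swap the last word of a surviving pattern.
            random.choice(survivors)[:-1] + [random.choice(vocab)]
            for _ in range(pop_size - len(survivors))
        ]
    return max(pop, key=fitness)

print(evolve())
```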
Genetic and Evolutionary Computation Conference | 2008
Robert M. Patton; Barbara G. Beckerman; Thomas E. Potok
A genetic algorithm (GA) was developed to implement a maximum variation sampling technique to derive a subset of data from a large dataset of unstructured mammography reports. Genetic algorithms are well known to perform well over large search spaces and to scale easily to the size of the data set. In mammography, much effort has been expended to characterize findings in the radiology reports. Existing computer-assisted technologies for mammography are based on machine-learning algorithms that must be trained against a set of reports with known pathologies in order to further refine the algorithms toward higher validity. In a large database of reports and corresponding images, automated tools are needed just to determine which data to include in the training set. This work presents preliminary results showing the use of a GA for finding abnormal reports without a training set. The underlying premise is that abnormal reports should consist of unusual or rare words, making them very dissimilar to other reports. A genetic algorithm was developed to test this hypothesis, and preliminary results show that most abnormal reports in a test set are found and can be adequately differentiated.
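The following sketch illustrates maximum variation sampling with a GA under stated assumptions: a toy set of reports, a Jaccard-style dissimilarity as a stand-in for the paper's rare-word-based measure, and a subset-of-indices encoding that may differ from the original.

```python
import itertools
import random

# Evolve a small subset of report indices whose members are maximally
# dissimilar; highly dissimilar reports are the candidate "abnormal" ones.
reports = [set(r.split()) for r in [
    "routine screening no abnormality",
    "routine screening unremarkable study",
    "spiculated mass with microcalcifications",
    "routine screening no abnormality seen",
    "architectural distortion left breast biopsy advised",
]]

def dissimilarity(a, b):
    return 1 - len(a & b) / len(a | b)          # Jaccard distance

def fitness(subset):
    return sum(dissimilarity(reports[i], reports[j])
               for i, j in itertools.combinations(subset, 2))

def evolve(k=3, pop_size=20, generations=30):
    idx = range(len(reports))
    pop = [random.sample(idx, k) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        pop = survivors + [
            # Mutation: replace one index in a surviving subset.
            random.sample(sorted(set(idx) - set(s)), 1) + s[1:]
            for s in (random.choice(survivors)
                      for _ in range(pop_size - len(survivors)))
        ]
    return max(pop, key=fitness)

print(evolve())
```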
Proceedings of SPIE | 2011
Justin M. Beaver; Chad A. Steed; Robert M. Patton; Xiaohui Cui; Matthew A Schultz
Effective visual analysis of computer network defense (CND) information is challenging due to the volume and complexity of both the raw and analyzed network data. A typical CND comprises multiple niche intrusion detection tools, each of which performs network data analysis and produces a unique alerting output. The state of the practice in situational awareness of CND data is the prevalent use of custom-developed scripts by Information Technology (IT) professionals to retrieve, organize, and understand potential threat events. We propose a new visual analytics framework, called the Oak Ridge Cyber Analytics (ORCA) system, for CND data that allows an operator to interact with all detection tool outputs simultaneously. Aggregated alert events are presented in multiple coordinated views with timeline, cluster, and swarm model analysis displays. These displays are complemented with both supervised and semi-supervised machine learning classifiers. The intent of the visual analytics framework is to improve CND situational awareness, to enable an analyst to quickly navigate and analyze thousands of detected events, and to combine sophisticated data analysis techniques with interactive visualization such that patterns of anomalous activity may be more easily identified and investigated.
D-Lib Magazine | 2012
Robert M. Patton; Christopher G. Stahl; Thomas E. Potok; J. C. Wells
Scientific user facilities provide physical resources and technical support that enable scientists to conduct experiments or simulations pertinent to their respective research. One metric for evaluating the scientific value or impact of a facility is the number of publications by users as a direct result of using that facility. Unfortunately, for a variety of reasons, capturing accurate values for this metric proves time-consuming and error-prone. This work describes a new approach that leverages automated browser technology combined with text analytics to reduce the time and error involved in identifying publications related to user facilities. With this approach, scientific user facilities gain more accurate measures of their impact as well as insight into policy revisions for user access.
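The text-analytics half of such a pipeline can be sketched as fuzzy matching of acknowledgment text against facility names; the browser-automation half is omitted. The facility aliases, threshold, and matching strategy below are illustrative assumptions, not the paper's method.

```python
from difflib import SequenceMatcher

# Hypothetical facility name variants to match against.
FACILITY_ALIASES = [
    "spallation neutron source",
    "oak ridge leadership computing facility",
]

def mentions_facility(acknowledgment, threshold=0.8):
    # Slide an alias-sized window over the text and fuzzy-compare,
    # so misspellings and OCR noise can still match.
    text = acknowledgment.lower()
    for alias in FACILITY_ALIASES:
        window = len(alias)
        for start in range(max(1, len(text) - window + 1)):
            chunk = text[start:start + window]
            if SequenceMatcher(None, chunk, alias).ratio() >= threshold:
                return True
    return False

ack = "This research used resources of the Spallation Neutron Sourse."
print(mentions_facility(ack))  # True despite the misspelling
```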
Hawaii International Conference on System Sciences | 2012
Robert M. Patton; Wade McNair; Christopher T. Symons; Jim N. Treadwell; Thomas E. Potok
Creating incentives for knowledge workers to share their knowledge within an organization continues to be a challenging task. Strong, innate behaviors of the knowledge worker, such as self-preservation and self-advancement, are difficult to overcome, regardless of the level of knowledge. Many incentive policies simply focus on providing external pressure to promote knowledge sharing. This work describes a technical approach to motivate sharing. A prototype system that utilizes text analysis and machine learning techniques to create an enhanced knowledge-sharing experience was developed and tested at Oak Ridge National Laboratory; it reduces the overhead cost of sharing while providing a quick, positive payoff for the knowledge worker. The implementation and experiences of using the prototype in a corporate production environment are described.
International Conference on Wireless Communications and Mobile Computing | 2011
Robert M. Patton; Justin M. Beaver; Chad A. Steed; Thomas E. Potok; Jim N. Treadwell
Most commercial intrusion detection systems (IDS) can produce a very high volume of alerts and are typically plagued by a high false positive rate. The approach described here uses Splunk to aggregate IDS alerts. The aggregated alerts are retrieved from Splunk programmatically, clustered using text analysis, and visualized using a sunburst diagram to provide an additional understanding of the data. Producing the equivalent of what the cluster analysis and visualization provide would require numerous detailed queries in Splunk and considerable manual effort.
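A minimal sketch of the clustering step, assuming the alerts have already been retrieved from Splunk (retrieval and the sunburst rendering are omitted), might use TF-IDF vectors and k-means; the alert strings below are toy stand-ins for real IDS output.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy aggregated alert messages standing in for real IDS output.
alerts = [
    "ET SCAN Nmap scripting engine user-agent detected",
    "ET SCAN Nmap OS detection probe",
    "ET POLICY SSH session on non-standard port",
    "ET POLICY outbound SSH to rare external host",
    "ET MALWARE known trojan C2 beacon observed",
    "ET MALWARE trojan checkin to known C2 domain",
]

# Vectorize the alert text and group similar alerts together.
vectors = TfidfVectorizer(stop_words="english").fit_transform(alerts)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(vectors)

for cluster in sorted(set(labels)):
    print(f"cluster {cluster}:")
    for alert, label in zip(alerts, labels):
        if label == cluster:
            print("   ", alert)
```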
Journal of Medical Systems | 2011
Carlos C. Rojas; Robert M. Patton; Barbara G. Beckerman
As massive collections of digital health data become available, the opportunities for large-scale automated analysis increase. In particular, the widespread collection of detailed health information is expected to help realize a vision of evidence-based public health and patient-centric health care. Within such a framework for large-scale health analytics, we describe the transformation of a large data set of mostly unlabeled, free-text mammography data into a searchable and accessible collection usable for analytics. We also describe several methods to characterize and analyze the data, including their temporal aspects, using information retrieval, supervised learning, and classical statistical techniques. We present experimental results that demonstrate the validity and usefulness of the approach: the results are consistent with the known features of the data, provide novel insights about the data, and can be used in specific applications. Additionally, based on the process of going from raw data to analysis results, we present the architecture of a generic system for health analytics from clinical notes.
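As a small illustration of the supervised-learning piece, the sketch below trains a toy classifier to label free-text reports as normal or abnormal; the reports, labels, and model choice are illustrative assumptions, and the paper's actual features, models, and temporal analyses are not reproduced.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labeled reports standing in for the curated mammography collection.
reports = [
    "no mammographic evidence of malignancy",
    "benign appearing calcifications unchanged",
    "suspicious clustered microcalcifications biopsy recommended",
    "irregular spiculated mass upper outer quadrant",
]
labels = ["normal", "normal", "abnormal", "abnormal"]

# TF-IDF features feeding a linear classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(reports, labels)
print(model.predict(["new spiculated mass noted"]))  # likely 'abnormal'
```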
2011 IEEE Symposium on Computational Intelligence in Cyber Security (CICS) | 2011
Justin M. Beaver; Robert M. Patton; Thomas E. Potok
Enterprise networks consist of thousands of interconnected computer hosts, each of which is capable of creating, removing, and exchanging data according to the needs of its users. Thus, the distribution of high-value, sensitive, and proprietary information across enterprise networks is poorly managed and understood. A significant technology gap in information security is the inability to automatically quantify the value of the information contained on each host in a network. Such insight would allow an enterprise to scale its defenses, react intelligently to an intrusion, manage its configuration audits, and understand the leak potential in the event that a host is compromised. This paper outlines a novel approach to the automated determination of the value of the information contained on a host computer. It involves the classification of each text document on the host machine using the frequency of the document's terms and phrases. A host information value is computed by applying an enterprise-defined weighting schema to the host's document distribution. The method is adaptable to specific organizational information needs, requires manual intervention only during schema creation, and is repeatable and consistent regardless of changes in information on the host machines.
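The described computation, classify each document and then apply an enterprise-defined weighting schema to the resulting class distribution, can be sketched as follows; the class labels, weights, and keyword-based stand-in classifier are hypothetical.

```python
# Hypothetical enterprise-defined weighting schema: class -> weight.
SCHEMA = {"public": 0.0, "internal": 1.0, "proprietary": 5.0, "sensitive": 10.0}

def classify(document_text):
    # Placeholder for the term/phrase-frequency classifier in the paper.
    if "trade secret" in document_text:
        return "proprietary"
    if "ssn" in document_text or "salary" in document_text:
        return "sensitive"
    if "internal use only" in document_text:
        return "internal"
    return "public"

def host_information_value(documents):
    # Classify every document on the host, then weight the distribution.
    counts = {label: 0 for label in SCHEMA}
    for doc in documents:
        counts[classify(doc)] += 1
    return sum(SCHEMA[label] * n for label, n in counts.items())

docs = ["meeting notes internal use only",
        "press release for the public website",
        "design doc containing a trade secret"]
print(host_information_value(docs))  # 1.0 + 0.0 + 5.0 = 6.0
```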