Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where John A. Springer is active.

Publication


Featured research published by John A. Springer.


CBE-Life Sciences Education | 2014

A Survey of Scholarly Literature Describing the Field of Bioinformatics Education and Bioinformatics Educational Research

Alejandra J. Magana; Manaz Taleyarkhan; Daniela Rivera Alvarado; Michael D. Kane; John A. Springer; Kari Clase

This article provides an overview of the state of research in bioinformatics education in the years 1998 through 2013. It identifies current curricular approaches for integrating bioinformatics education, concepts and skills being taught, pedagogical approaches and methods of delivery, and educational research and evaluation results.


The American Journal of Pharmaceutical Education | 2011

Pharmacogenomics Training Using an Instructional Software System

John A. Springer; Nicholas V. Iannotti; Michael D. Kane; Kevin Haynes; Jon E. Sprague

Objectives. To implement an elective course in pharmacogenomics designed to teach pharmacy students about the fundamentals of pharmacogenomics and the anticipated changes it will bring to the profession.

Design. The 8 sessions of the course covered the basics of pharmacogenomics, genomic biotechnology, implementation of pharmacogenetics in pharmacy, information security and privacy, ethical issues related to the use of genomic data, pharmacoepidemiology, and the use and promotion of GeneScription, a software program designed to mimic the professional pharmacy environment.

Assessment. Student grades were based on completion of a patient education pamphlet, a 2-page paper on pharmacogenomics, and precourse and postcourse survey instruments. In the postcourse survey, all students strongly agreed that genomic data could be used to determine the optimal dose of a drug and that genomic data for metabolizing enzymes could be stored in a safe place. Students also were more willing to submit deoxyribonucleic acid (DNA) data for genetic profiling and better understood how DNA analysis is performed after completing the course.

Conclusions. An elective course in pharmacogenomics equipped pharmacy students with the basic knowledge necessary to make clinical decisions based on pharmacogenomic data and to teach other healthcare professionals and patients about pharmacogenomics. For personalized medicine to become a reality, all pharmacists and pharmacy students must acquire this knowledge and these skills.


International Conference on Automation, Robotics and Applications | 2015

Wireless Sensor Network and Big Data in Cooperative Fire Security system using HARMS

Bakytgul Khaday; Eric T. Matson; John A. Springer; Young Ki Kwon; Hansu Kim; Sunbin Kim; Daulet Kenzhebalin; Cho Sukyeong; Jinwoong Yoon; Hong Seung Woo

Growing populations and a shortage of land in urban areas have led to the development of tall buildings. Tall buildings have advantages and, at the same time, disadvantages. One disadvantage is that they are not fully safe in fire situations, because fire trucks cannot reach them. Fire danger can be prevented and eliminated if it is detected early. Implementing a WSN (Wireless Sensor Network) and Big Data, and collecting and sending data to other members of the Cooperative Fire Security System using Human Agent Robot Machine Sensor (CFS2H) messaging protocol, establishes faster communication and collaboration among all the members of the whole system. The stationary WSN generates and analyzes the data and wirelessly communicates with other members of the system. Big Data serves as the central data manipulation center, which communicates with all the system members and controls the work of the whole system. This paper presents a detailed implementation and application of WSN and Big Data in a cooperative firefighting system.


Proceedings of the 2015 Workshop on Changing Landscapes in HPC Security | 2015

Toward a Data Spillage Prevention Process in Hadoop using Data Provenance

Oluwatosin O. Alabi; Joe Beckman; Melissa Dark; John A. Springer

Recent data breaches involving large companies have demonstrated that the loss of control over protected and confidential data can become a serious threat to business operations and national security. As the use of Hadoop continues to grow rapidly, the development of methods for addressing security challenges related to Hadoop becomes imperative, and in this paper, we describe our efforts to remedy one such challenge, data spillage. We discuss our work in developing a conceptual framework for collecting provenance data and investigating data spillage within our Hadoop cluster and review some preliminary findings from our test case looking at data spillage in the Hadoop Distributed File System (HDFS). We also distill our lessons learned and mention activities already underway to continue this work.
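As a rough illustration of the idea, not the paper's actual framework, the sketch below records simple provenance events and flags protected data written outside an authorized zone. The record fields, the zone list, and the spillage rule are all illustrative assumptions.

```python
# Illustrative sketch only: the paper collects provenance inside Hadoop/HDFS;
# this reproduces just the shape of a provenance-based spillage check.
from dataclasses import dataclass


@dataclass
class ProvenanceRecord:
    source_path: str      # where the data block came from
    dest_path: str        # where it was written
    classification: str   # e.g. "protected" or "public"


# Hypothetical authorized zone for protected data.
AUTHORIZED_ZONES = ("/secure/",)


def find_spillage(records):
    # Data spillage: protected data written outside an authorized zone.
    return [r for r in records
            if r.classification == "protected"
            and not r.dest_path.startswith(AUTHORIZED_ZONES)]


records = [
    ProvenanceRecord("/secure/a", "/secure/b", "protected"),
    ProvenanceRecord("/secure/a", "/tmp/scratch/b", "protected"),
]
spilled = find_spillage(records)  # flags only the /tmp copy
```

Keeping source, destination, and classification per write is what makes the check possible at all; without provenance, the copy in `/tmp` would be indistinguishable from any other file.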


International Journal of Business Process Integration and Management | 2010

Scientific workflow management systems and workflow patterns

Amruta Shiroor; John A. Springer; Thomas J. Hacker; Brandeis Marshall; Jeffrey L. Brewer

Scientific workflow management systems primarily consist of data flow oriented execution models, and consequently, these systems provide a limited number of control flow constructs that are represented in dissimilar ways across different scientific workflow systems. This is a problem, since the exploratory nature of scientific analysis requires the workflows to dynamically adapt to external events and control the execution of different workflow components. Hence, some degree of control flow is necessary. The lack of standard specifications for control flow constructs in scientific workflow management systems leads to workflows designed using custom developed components with almost no reusability. In this paper, we present a standard set of control flow constructs for scientific workflow management systems using workflow patterns. First, we compare the control flow constructs present in three scientific workflow management systems: Kepler, Taverna, and Triana. Second, these patterns are implemented in the form of a template library in Kepler. Finally, we demonstrate the use of this template library to design scientific workflows.
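The workflow patterns the paper draws on can be illustrated with a minimal sketch. The code below shows the "exclusive choice" control-flow pattern in plain Python; the function names and the quality-score routing rule are illustrative assumptions, not constructs taken from Kepler, Taverna, or Triana.

```python
# "Exclusive choice" pattern: a data token is routed down exactly one
# of two branches based on a runtime condition.
def exclusive_choice(token, condition, branch_a, branch_b):
    """Route a data token down exactly one of two branches."""
    return branch_a(token) if condition(token) else branch_b(token)


# Hypothetical usage: route a sample by quality score to different steps.
result = exclusive_choice(
    {"sample": "S1", "quality": 0.92},
    condition=lambda t: t["quality"] >= 0.9,
    branch_a=lambda t: ("full-analysis", t["sample"]),
    branch_b=lambda t: ("re-acquire", t["sample"]),
)
# result == ("full-analysis", "S1")
```

A pure data-flow model has no native place for the `condition` above, which is exactly the gap the paper's template library fills.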


Conference on Information Technology Education | 2007

Integrating bioinformatics, distributed data management, and distributed computing for applied training in high performance computing

Michael D. Kane; John A. Springer

The utilization of multi-core and multi-node parallel high performance computing (HPC) systems is growing rapidly to meet computational demands in the scientific computing arena. For example, the exponential growth of genomic data has outpaced increases in single CPU clock speeds by 15-fold over the last 20 years, placing great value on the use of parallel processing systems in bioinformatics. Fortunately, increased demand for multi-node architectures has resulted in decreased costs for distributed computing components making these architectures more affordable to organizations and institutions. As the demand for HPC computer architectures grows, so does the demand for professionals skilled in the implementation, utilization and administration of these systems. With the goal of training undergraduate and graduate students to meet this demand, a model HPC training module has been developed and implemented that integrates bioinformatics, distributed data management and distributed computing. In this HPC training module bioinformatics provides exposure to applied scientific computing as well as the rationale for multi-processor computing to overcome large computational problems. In addition, the parallelization of computing is explored from the classic divide-and-conquer approach, as well as the distributed data management perspective, which places emphasis on the network bandwidth and disk paging as detractors to HPC performance. Students participate in the HPC module through hands-on interactions with three different HPC cluster types: (1) Beowulf, (2) blade servers, and (3) multi-processor shared memory systems. The results of this training module include exploratory student projects to determine mathematical relationships between HPC performance and (1) processing nodes, (2) cluster type, (3) database size and segmentation methods, (4) bioinformatics application type, (5) RAM per node, and (6) network bandwidth. 
The outcome of this training module is hands-on training in HPC across multiple cluster types, and across multiple computer and information technology perspectives.
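As a rough illustration of the divide-and-conquer parallelization the module explores, the sketch below splits a small sequence database into segments and scans each segment in a separate worker process. The database, motif, and segmentation rule are illustrative assumptions; a real cluster run would distribute segments across nodes rather than local processes.

```python
# Divide-and-conquer sketch: segment a sequence database and count motif
# occurrences in each segment in parallel, then combine the partial counts.
from multiprocessing import Pool


def count_matches(args):
    segment, motif = args
    # Non-overlapping motif count within one database segment.
    return sum(seq.count(motif) for seq in segment)


def parallel_scan(database, motif, workers=4):
    size = max(1, len(database) // workers)
    segments = [database[i:i + size] for i in range(0, len(database), size)]
    with Pool(workers) as pool:
        return sum(pool.map(count_matches, [(s, motif) for s in segments]))


if __name__ == "__main__":
    db = ["ACGTACGT", "TTACGTT", "GGGACGT"]
    print(parallel_scan(db, "ACGT"))  # 4
```

The segment size is the knob the module's student projects vary: smaller segments mean better load balance but more communication, mirroring the network-bandwidth and paging effects the abstract calls out.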


Conference on Information Technology Education | 2008

Meeting the data challenge: curriculum development for parallel data systems

Thomas J. Hacker; John A. Springer

The emergence of commodity-based high performance computing systems and low-cost storage systems in concert with the continued proliferation of data has created a significant need for technologists with expertise in parallel data systems. The training in this area, though, falls outside the traditional boundaries of the data management curriculum. In this paper, we describe our efforts in developing a new course focused on parallel data systems, which exploit the power of high performance computing and commodity hardware to deliver high throughput and well-scaled storage systems. We describe in detail the trends and forces driving the need for this course, the topics to be covered in this course, the data laboratory to be used with the course, assessment methods to measure student progress, and desired learning outcomes for the course.


International Journal of Social Research Methodology | 2015

Seeing beyond the datasets: the unexploited structure of electoral and institutional data

Thomas Mustillo; John A. Springer

We propose relational data modeling as a tool for replacing the ad hoc and uncoordinated approaches commonly used throughout the social sciences to gather, store, and disseminate data. We demonstrate relational data modeling using global electoral and political institutional data. We define a relational data model as a map of concepts, their attributes, and the relationships between concepts, developed using a formal language and according to a set of rules. To demonstrate the methodology, we design a simple relational data model of six concepts: countries, parties, elections, districts, institutions, and election results. Furthermore, we introduce a data model to solve the particularly vexing issue of party discontinuity (party splits, mergers, and alliances). We show how the solution facilitates computational tasks, such as the calculation of core measures of political phenomena (e.g., electoral volatility). Ultimately, a relational data approach will play a central role in collective investments to develop advanced data capabilities, and thereby advance the accuracy, pace, and transparency of scholarship in the social sciences.
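A minimal sketch of the kind of relational model described, using three of the six concepts (countries, parties, and election results), might look as follows. The table and column names, the sample data, and the volatility query are illustrative assumptions, not the authors' actual schema.

```python
# Sketch of a relational model for electoral data, with foreign keys
# linking parties to countries and results to parties.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE party (
    id INTEGER PRIMARY KEY,
    country_id INTEGER NOT NULL REFERENCES country(id),
    name TEXT NOT NULL
);
CREATE TABLE election_result (
    party_id INTEGER NOT NULL REFERENCES party(id),
    election_year INTEGER NOT NULL,
    vote_share REAL NOT NULL
);
""")
conn.execute("INSERT INTO country VALUES (1, 'Ecuador')")
conn.executemany("INSERT INTO party VALUES (?, 1, ?)",
                 [(1, "Party A"), (2, "Party B")])
conn.executemany("INSERT INTO election_result VALUES (?, ?, ?)",
                 [(1, 2006, 0.40), (2, 2006, 0.60),
                  (1, 2009, 0.55), (2, 2009, 0.45)])

# Electoral volatility (Pedersen index) falls out of a simple query:
# half the sum of absolute vote-share changes between elections.
rows = conn.execute("""
    SELECT party_id,
           MAX(CASE WHEN election_year = 2006 THEN vote_share END),
           MAX(CASE WHEN election_year = 2009 THEN vote_share END)
    FROM election_result GROUP BY party_id
""").fetchall()
volatility = 0.5 * sum(abs(b - a) for _, a, b in rows)
print(round(volatility, 2))  # 0.15
```

The point of the schema is that volatility is computed, not stored: once results are normalized against parties and years, measures like this become queries rather than hand-maintained columns.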


International Conference on Bioinformatics | 2014

AlignMR: mass spectrometry peak alignment using Hadoop MapReduce

Urmi Bhayani; John A. Springer

Proteomics is the study of the structure and behavior of proteins, and one of the primary approaches to protein identification and quantification is through the analysis of Mass Spectrometry (MS) data. This analysis typically involves a series of different computational steps, and the Purdue University Bindley Bioscience Center employs a computational workflow system, the Omics Discovery Pipeline (ODP), to assist in its analysis of MS data. One of the ODP's stages entails aligning the peaks in the MS data across multiple subjects, and due to the large number of subjects that may be used in a study and the large number of peaks found in each subject's corresponding MS data, the alignment step qualifies as a data intensive computation. This research focuses on using Apache Hadoop MapReduce to align the processed MS data in a computationally faster manner than the serial approach currently used in the ODP.
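A hedged sketch of the map/reduce shape of peak alignment is shown below; the real AlignMR runs on Hadoop, and binning peaks by a fixed m/z tolerance is an illustrative alignment rule assumed here, not the paper's algorithm.

```python
# Map phase: each subject's peaks emit (bin, (subject, m/z)) pairs.
# Reduce phase: pairs sharing a bin become one aligned peak across subjects.
from collections import defaultdict


def map_phase(subject_id, peaks, tol=0.5):
    # Peaks whose m/z values fall within the same tolerance bin share a key.
    for mz in peaks:
        yield (round(mz / tol), (subject_id, mz))


def reduce_phase(pairs):
    # Group by bin key, as Hadoop's shuffle would before the reducer runs.
    bins = defaultdict(list)
    for key, value in pairs:
        bins[key].append(value)
    return dict(bins)


# Hypothetical spectra for two subjects: two shared peaks near 100 and 250.
spectra = {"s1": [100.02, 250.51], "s2": [100.07, 250.49]}
pairs = [p for sid, peaks in spectra.items() for p in map_phase(sid, peaks)]
aligned = reduce_phase(pairs)  # two bins, each holding both subjects
```

The parallelism comes for free from the key grouping: each bin can be reduced independently, which is what lets Hadoop spread the alignment of many subjects across a cluster.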


International Conference on Bioinformatics | 2013

pXAlign: A parallel implementation of XAlign

Aditi A. Magikar; John A. Springer

Proteomics involves the assessment of a large number of protein molecules, and mass spectrometry is a proteomic tool that is used for assessment of these protein molecules. As an example, the Proteome Discovery Pipeline at Purdue University Bindley Bioscience Center carries out data processing and discovery of proteins using mass spectrometry-based proteomics. Each stage of the Proteome Discovery Pipeline does a different computation task, and currently, each stage of the pipeline is executed in a serial manner. The XAlign stage of the pipeline enables data processing and alignment of the protein peaks across different samples. The XAlign stage deals with vast amounts of data, and this can be a potential data processing bottleneck in the pipeline. Moreover, the serial nature of the XAlign code can cause additional bottlenecks. Using commonly used parallelization techniques, MPI and OpenMP, our work introduces parallelism into the XAlign code to investigate possible performance improvements, and as a result, we found a notable speedup of the XAlign software.

Collaboration


Dive into John A. Springer's collaboration.

Top Co-Authors

Jon E. Sprague

Ohio Northern University