Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Andrew L. Beam is active.

Publication


Featured research published by Andrew L. Beam.


Reproductive Toxicology | 2012

Zebrafish developmental screening of the ToxCast™ Phase I chemical library

Stephanie Padilla; D. Corum; Beth Padnos; Deborah L. Hunter; Andrew L. Beam; Keith A. Houck; Nisha S. Sipes; Nicole C. Kleinstreuer; Thomas B. Knudsen; David J. Dix; David M. Reif

Zebrafish (Danio rerio) is an emerging toxicity screening model for both human health and ecology. As part of the Computational Toxicology Research Program of the U.S. EPA, the toxicity of the 309 ToxCast™ Phase I chemicals was assessed using a zebrafish screen for developmental toxicity. All exposures were by immersion from 6-8 h post fertilization (hpf) to 5 days post fertilization (dpf), over a nominal concentration range of 1 nM-80 μM. On 6 dpf, larvae were assessed for death and overt structural defects. Results revealed that the majority (62%) of chemicals were toxic to the developing zebrafish; both toxicity incidence and potency were correlated with chemical class and hydrophobicity (logP); and inter- and intra-plate replicates showed good agreement. The zebrafish embryo screen, by providing an integrated model of the developing vertebrate, complements the ToxCast assay portfolio and has the potential to provide information relative to overt and organismal toxicity.


Journal of Toxicology and Environmental Health, Part B: Critical Reviews | 2010

Xenobiotic-metabolizing enzyme and transporter gene expression in primary cultures of human hepatocytes modulated by ToxCast chemicals.

Daniel M. Rotroff; Andrew L. Beam; David J. Dix; Adam M. Farmer; Kimberly M. Freeman; Keith A. Houck; Richard S. Judson; Edward L. LeCluyse; Matthew T. Martin; David M. Reif; Stephen S. Ferguson

Primary human hepatocyte cultures are useful in vitro model systems of human liver because when cultured under appropriate conditions the hepatocytes retain liver-like functionality such as metabolism, transport, and cell signaling. This model system was used to characterize the concentration- and time-response of the 320 ToxCast chemicals for changes in expression of genes regulated by nuclear receptors. Fourteen gene targets were monitored in quantitative nuclease protection assays: six representative cytochromes P-450, four hepatic transporters, three Phase II conjugating enzymes, and one endogenous metabolism gene involved in cholesterol synthesis. These gene targets are sentinels of five major signaling pathways: AhR, CAR, PXR, FXR, and PPARα. Besides gene expression, the relative potency and efficacy for these chemicals to modulate cellular health and enzymatic activity were assessed. Results demonstrated that the culture system was an effective model of chemical-induced responses by prototypical inducers such as phenobarbital and rifampicin. Gene expression results identified various ToxCast chemicals that were potent or efficacious inducers of one or more of the 14 genes, and by inference the 5 nuclear receptor signaling pathways. Significant relative risk associations with rodent in vivo chronic toxicity effects are reported for the five major receptor pathways. These gene expression data are being incorporated into the larger ToxCast predictive modeling effort.


JAMA | 2016

Translating Artificial Intelligence Into Clinical Care

Andrew L. Beam; Isaac S. Kohane

Artificial intelligence has become a frequent topic in the news cycle, with reports of breakthroughs in speech recognition, computer vision, and textual understanding that have made their way into a bevy of products and services that are used every day. In contrast, clinical care has yet to reach the much lower bar of automating health care information transactions in the form of electronic health records. Medical leaders in the 1960s and 1970s were already speculating about the opportunities to bring automated inference methods to patient care,1 but the methods and data had not yet reached the critical mass needed to achieve those goals.

The intellectual roots of “deep learning,” which power the commodity and consumer implementations of present-day artificial intelligence, were planted even earlier, in the 1940s and 1950s, with the development of “artificial neural network” algorithms.2,3 These algorithms, as their name suggests, are very loosely based on the way in which the brain’s web of neurons adaptively becomes rewired in response to external stimuli to perform learning and pattern recognition. Though these methods have had many success stories over the past 70 years, their performance and adoption in medicine have seen a quantum leap in the past 5 years. The catalyzing event occurred in 2012, when a team of researchers from the University of Toronto cut the error rate in half on a well-known computer vision challenge using a deep learning algorithm.4 This work rapidly accelerated research and development in deep learning and propelled the field forward at a staggering pace. With the increased availability of digital clinical data, it remains to be seen how these deep learning models might be applied to the medical domain.

In this issue of JAMA, Gulshan and colleagues5 present findings from a study evaluating the use of deep learning for detection of diabetic retinopathy and macular edema. To build their model, the authors collected 128 175 annotated images from the EyePACs database. Each image was rated by 3 to 7 clinicians for referable diabetic retinopathy, diabetic macular edema, and overall image quality; each rater was selected from a panel of 54 board-certified ophthalmologists and senior ophthalmology residents. Using this data set, the algorithm learned to predict the consensus grade of the raters along each clinical attribute: referable diabetic retinopathy, diabetic macular edema, and image quality. To validate their algorithm, the authors assessed its performance on 2 separate and nonoverlapping data sets consisting of 9963 and 1748 images. On the validation data, the algorithm had high sensitivity and specificity; only one of these values (sensitivity on the second validation data set) failed to be superior at a statistically significant level. The other performance metrics (eg, area under the receiver operating characteristic curve, negative predictive value, positive predictive value) were likewise impressive, giving the authors confidence that this algorithm could be of clinical utility.

This work closely mirrors a recent “Kaggle” contest in which 661 teams competed to build an algorithm to predict the grade of diabetic retinopathy, albeit on a smaller data set with fewer grades per image. Kaggle is a website that hosts machine learning and data science contests; companies and researchers can post their data to Kaggle and have contestants from around the world build predictive models.
In the diabetic retinopathy contest, nearly all of the top teams used some form of deep learning despite having little to no knowledge of the eye or ophthalmology; the first-place team6 and second-place team7 both used standard deep learning models and were data science practitioners, not medical professionals. Gulshan et al correctly pointed out that a prerequisite for a successful deep learning model is access to a large database of images with high-quality annotations. Accordingly, the investigators increased both the number of images available and the number of ratings per image, which allowed them to improve on the existing state of the art with respect to both Kaggle and the existing scientific literature.

To build their algorithm, Gulshan et al leveraged a workhorse model in deep learning known as a convolutional neural network, which has been critically important to recent advances in automatic image recognition. The convolutional neural network used by the authors is known as the Inception-V3 network,8 which was developed by Google for the Large Scale Visual Recognition Challenge, winning in 2014. In this contest, known as ImageNet,9 researchers were given 1.2 million images spanning 1000 categories that cover a wide variety of everyday objects, such as cats, dogs, automobiles, and different kinds of food. The goal of the contest was to build a classifier that could automatically recognize which object was present in an image and identify which region of the image contained it. The challenge was deliberately broad so that it covered many types of objects a computer vision system could encounter in the real world. As a result of this contest, several techniques10-12 were pioneered that improved the accuracy of these models immensely. As the study by Gulshan et al shows, these improvements are beginning to trickle into other areas of computer vision, including medical image processing. For example, Gulshan et al not only used the same network that was originally built for ImageNet, they also used that network […]
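For readers curious about the mechanics, the sketch below shows the transfer-learning pattern described above: load the Inception-V3 architecture with ImageNet weights and attach a new binary head for referable diabetic retinopathy. This is a minimal illustration, not Gulshan et al's implementation; the Keras API choice, classification head, and (commented-out) datasets are assumptions.

```python
# Minimal transfer-learning sketch (hypothetical, not the authors' code):
# reuse Inception-V3's ImageNet features and train only a new binary head.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import InceptionV3

base = InceptionV3(weights="imagenet", include_top=False,
                   input_shape=(299, 299, 3))   # Inception-V3's native input size
base.trainable = False                          # keep ImageNet features frozen

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(1, activation="sigmoid"),      # referable retinopathy: yes/no
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC(name="auc")])
# model.fit(train_ds, validation_data=val_ds, epochs=5)  # hypothetical tf.data sets
```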


JAMA | 2018

Big Data and Machine Learning in Health Care

Andrew L. Beam; Isaac S. Kohane

Nearly all aspects of modern life are in some way being changed by big data and machine learning. Netflix knows what movies people like to watch and Google knows what people want to know based on their search histories. Indeed, Google has recently begun to replace much of its existing non–machine learning technology with machine learning algorithms, and there is great optimism that these techniques can provide similar improvements across many sectors. It is no surprise then that medicine is awash with claims of revolution from the application of machine learning to big health care data. Recent examples have demonstrated that big data and machine learning can create algorithms that perform on par with human physicians.1

Though machine learning and big data may seem mysterious at first, they are in fact deeply related to traditional statistical models that are recognizable to most clinicians. It is our hope that elucidating these connections will demystify these techniques and provide a set of reasonable expectations for the role of machine learning and big data in health care.

Machine learning was originally described as a program that learns to perform a task or make a decision automatically from data, rather than having the behavior explicitly programmed. However, this definition is very broad and could cover nearly any form of data-driven approach. For instance, consider the Framingham cardiovascular risk score, which assigns points to various factors and produces a number that predicts 10-year cardiovascular risk. Should this be considered an example of machine learning? The answer might obviously seem to be no, but closer inspection of the Framingham risk score reveals that the answer might not be as obvious as it first seems. The score was originally created2 by fitting a proportional hazards model to data from more than 5300 patients, and so the “rule” was in fact learned entirely from data. Designating a risk score as a machine learning algorithm might seem a strange notion, but this example reveals the uncertain nature of the original definition of machine learning.

It is perhaps more useful to imagine an algorithm as existing along a continuum between fully human-guided vs fully machine-guided data analysis. To understand the degree to which a predictive or diagnostic algorithm can be said to be an instance of machine learning requires understanding how much of its structure or parameters were predetermined by humans. The trade-off between human specification of a predictive algorithm’s properties vs learning those properties from data is what is known as the machine learning spectrum. Returning to the Framingham study, to create the original risk score, statisticians and clinical experts worked together to make many important decisions, such as which variables to include in the model, the relationship between the dependent and independent variables, and variable transformations and interactions. Since considerable human effort was used to define these properties, it would place low on the machine learning spectrum (#19 in the Figure and Supplement). Many evidence-based clinical practices are based on a statistical model of this sort, and so many clinical decisions in fact exist on the machine learning spectrum (middle left of Figure). On the extreme low end of the machine learning spectrum would be heuristics and rules of thumb that do not directly involve the use of any rules or models explicitly derived from data (bottom left of Figure).

Suppose a new cardiovascular risk score is created that includes possible extensions to the original model.
For example, it could be that risk factors should not be added but instead should be multiplied or divided, or perhaps a particularly important risk factor should square the entire score if it is present. Moreover, if it is not known in advance which variables will be important, but thousands of individual measurements have been collected, how should a good model be identified from among the infinite possibilities? This is precisely what a machine learning algorithm attempts to do. As humans impose fewer assumptions on the algorithm, it moves further up the machine learning spectrum. However, there is never a specific threshold at which a model suddenly becomes “machine learning”; rather, all of these approaches exist along a continuum, determined by how many human assumptions are placed onto the algorithm.

An example of an approach high on the machine learning spectrum has recently emerged in the form of so-called deep learning models. Deep learning models are stunningly complex networks of artificial neurons that were designed expressly to create accurate models directly from raw data. Researchers recently demonstrated a deep learning algorithm capable of detecting diabetic retinopathy (#4 in the Figure, top center) from retinal photographs at a sensitivity equal to or greater than that of ophthalmologists.1 This model learned the diagnosis procedure directly from the raw pixels of the images, with no human intervention beyond a team of ophthalmologists who annotated each image with the correct diagnosis. Because they are able to learn the task with little human instruction or prior assumptions, these deep learning algorithms rank very high on the machine learning spectrum (Figure, light blue circles).

Though they require less human guidance, deep learning algorithms for image recognition require enormous amounts of data to capture the full complexity, variety, and nuance inherent to real-world images. Consequently, these algorithms often require hundreds of thousands of examples to extract the salient image features that are correlated with the outcome of interest. Higher placement on the machine learning spectrum does not imply superiority, because different tasks require different levels of human involvement. While algorithms high on the spectrum are often very flexible and can learn many tasks, they are often uninterpretable.
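To make the low end of the spectrum concrete, here is a sketch of a Framingham-style risk model: a proportional hazards fit in which humans choose the variables and functional form and only the coefficients are learned from data. The cohort below is synthetic and the column names are hypothetical; it uses the lifelines library, not the Framingham data or code.

```python
# A human-specified survival model: low on the machine learning spectrum.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
n = 5300  # roughly the size of the original Framingham cohort
df = pd.DataFrame({
    "age": rng.normal(55, 10, n),           # variables chosen by experts
    "systolic_bp": rng.normal(130, 15, n),
    "smoker": rng.binomial(1, 0.3, n),
})
risk = 0.04 * df["age"] + 0.02 * df["systolic_bp"] + 0.5 * df["smoker"]
df["years_to_event"] = rng.exponential((np.exp(-risk / 3) * 10).to_numpy())
df["event"] = rng.binomial(1, 0.4, n)

cph = CoxPHFitter().fit(df, duration_col="years_to_event", event_col="event")
cph.print_summary()  # only the coefficients were learned from the data
```

Everything except the coefficients was fixed by hand, which is exactly why such a score sits low on the spectrum.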


BMC Bioinformatics | 2014

Bayesian neural networks for detecting epistasis in genetic association studies

Andrew L. Beam; Alison A. Motsinger-Reif; Jon Doyle

Background: Discovering causal genetic variants from large genetic association studies poses many difficult challenges. Assessing which genetic markers are involved in determining trait status is a computationally demanding task, especially in the presence of gene-gene interactions.

Results: A non-parametric Bayesian approach in the form of a Bayesian neural network is proposed for use in analyzing genetic association studies. Demonstrations on synthetic and real data reveal that these networks are able to efficiently and accurately determine which variants are involved in determining case-control status. By using graphics processing units (GPUs) the time needed to build these models is decreased by several orders of magnitude. In comparison with commonly used approaches for detecting interactions, Bayesian neural networks perform very well across a broad spectrum of possible genetic relationships.

Conclusions: The proposed framework is shown to be a powerful method for detecting causal SNPs while being computationally efficient enough to handle large datasets.
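A minimal sketch of the core idea on synthetic genotypes, not the authors' GPU implementation: a one-hidden-layer Bayesian neural network for case-control status, sampled with PyMC. The interaction effect, genotype coding, and network size are illustrative assumptions.

```python
# Bayesian neural network sketch for a synthetic SNP case-control study.
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
n, p, h = 500, 10, 4                               # subjects, SNPs, hidden units
X = rng.integers(0, 3, size=(n, p)).astype(float)  # genotypes coded 0/1/2
logits = 0.8 * X[:, 0] * X[:, 1] - 1.0             # hypothetical SNP-SNP interaction
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logits)))

with pm.Model():
    w_in = pm.Normal("w_in", 0.0, 1.0, shape=(p, h))  # priors over all weights
    b_in = pm.Normal("b_in", 0.0, 1.0, shape=h)
    w_out = pm.Normal("w_out", 0.0, 1.0, shape=h)
    b_out = pm.Normal("b_out", 0.0, 1.0)
    hidden = pm.math.tanh(pm.math.dot(X, w_in) + b_in)
    pm.Bernoulli("y_obs", logit_p=pm.math.dot(hidden, w_out) + b_out, observed=y)
    idata = pm.sample(1000, tune=1000, chains=2)   # posterior over the network

# Posterior summaries of w_in indicate which SNPs drive case-control status.
```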


Journal of Computational and Graphical Statistics | 2016

Fast Hamiltonian Monte Carlo Using GPU Computing

Andrew L. Beam; Sujit K. Ghosh; Jon Doyle

In recent years, the Hamiltonian Monte Carlo (HMC) algorithm has been found to work more efficiently than other popular Markov chain Monte Carlo (MCMC) methods (such as random walk Metropolis–Hastings) in generating samples from a high-dimensional probability distribution. HMC has proven more efficient in terms of mixing rates and effective sample size than previous MCMC techniques, but may still not be sufficiently fast for particularly large problems. The use of GPUs promises to push HMC even further, greatly increasing the utility of the algorithm. By expressing the computationally intensive portions of HMC (the evaluations of the probability kernel and its gradient) in terms of linear or element-wise operations, HMC can be made highly amenable to the use of graphics processing units (GPUs). A multinomial regression example demonstrates the promise of GPU-based HMC sampling. By using GPU-based memory objects to perform the entire HMC simulation, most of the latency penalties associated with transferring data from main to GPU memory can be avoided. The proposed computational framework may thus appear conceptually very simple, but it has the potential to be applied to a wide class of hierarchical models relying on HMC sampling. Models whose posterior density and corresponding gradients can be reduced to linear or element-wise operations are amenable to significant speed-ups through the use of GPUs. Analyses of datasets that were previously intractable for fully Bayesian approaches due to the prohibitively high computational cost are now feasible using the proposed framework.
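The central claim, that HMC reduces to matrix products and element-wise operations, can be seen in a small NumPy sketch of one HMC transition for Bayesian logistic regression (a simpler stand-in for the paper's multinomial example). Every array expression below maps directly onto GPU array libraries; the step size and path length are arbitrary choices, not the paper's settings.

```python
import numpy as np

def log_post_and_grad(beta, X, y, tau=10.0):
    # Bayesian logistic regression with a N(0, tau^2) prior on coefficients.
    # Only matrix products and element-wise ops: the GPU-friendly structure.
    z = X @ beta
    log_post = y @ z - np.sum(np.log1p(np.exp(z))) - 0.5 * beta @ beta / tau**2
    grad = X.T @ (y - 1.0 / (1.0 + np.exp(-z))) - beta / tau**2
    return log_post, grad

def hmc_step(beta, X, y, rng, eps=0.01, n_leapfrog=20):
    r = rng.standard_normal(beta.shape)        # draw momentum
    logp0, g = log_post_and_grad(beta, X, y)
    h0 = logp0 - 0.5 * r @ r                   # negative Hamiltonian at start
    b = beta.copy()
    for _ in range(n_leapfrog):                # leapfrog integration
        r = r + 0.5 * eps * g                  # half-step momentum
        b = b + eps * r                        # full-step position
        logp, g = log_post_and_grad(b, X, y)
        r = r + 0.5 * eps * g                  # second half-step momentum
    if np.log(rng.uniform()) < (logp - 0.5 * r @ r) - h0:
        return b                               # accept proposal
    return beta                                # reject, keep current state
```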


BMJ | 2018

Postsurgical prescriptions for opioid naive patients and association with overdose and misuse: retrospective cohort study

Gabriel Brat; Denis Agniel; Andrew L. Beam; Brian K. Yorkgitis; Mark C. Bicket; Mark L. Homer; Kathe Fox; Daniel Knecht; Cheryl N. McMahill-Walraven; Nathan Palmer; Isaac S. Kohane

Objective To quantify the effects of varying opioid prescribing patterns after surgery on dependence, overdose, or abuse in an opioid naive population. Design Retrospective cohort study. Setting Surgical claims from a linked medical and pharmacy administrative database of 37 651 619 commercially insured patients between 2008 and 2016. Participants 1 015 116 opioid naive patients undergoing surgery. Main outcome measures Use of oral opioids after discharge as defined by refills and total dosage and duration of use. The primary outcome was a composite of misuse identified by a diagnostic code for opioid dependence, abuse, or overdose. Results 568 612 (56.0%) patients received postoperative opioids, and a code for abuse was identified for 5906 patients (0.6%, 183 per 100 000 person years). Total duration of opioid use was the strongest predictor of misuse, with each refill associated with an adjusted 44.0% increase in the rate of misuse (95% confidence interval 40.8% to 47.2%, P<0.001) and each additional week of opioid use associated with a 19.9% increase in hazard (18.5% to 21.4%, P<0.001). Conclusions Each refill and week of opioid prescription is associated with a large increase in opioid misuse among opioid naive patients. The data from this study suggest that duration of the prescription, rather than dosage, is more strongly associated with ultimate misuse in the early postsurgical period. The analysis quantifies the association of prescribing choices with opioid misuse and identifies levers for possible impact.
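For intuition only: compounding the reported adjusted estimates gives rough relative rates, under the assumption (not stated in the abstract) that the per-refill and per-week effects multiply.

```python
# Back-of-envelope compounding of the reported adjusted associations.
refill_rr = 1.440   # +44.0% rate of misuse per refill
week_hr = 1.199     # +19.9% hazard per additional week of use

print(refill_rr ** 2)   # two refills -> roughly 2.1x the misuse rate
print(week_hr ** 4)     # four extra weeks -> roughly 2.1x the hazard
```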


Journal of Pharmacogenomics and Pharmacoproteomics | 2014

Beyond IC50s: Towards Robust Statistical Methods for in vitro Association Studies

Andrew L. Beam; Alison A. Motsinger-Reif

Cell line cytotoxicity assays have become increasingly popular approaches for genetic and genomic studies of differential cytotoxic response. There are an increasing number of success stories, but relatively little evaluation of the statistical approaches used in such studies. In the vast majority of these studies, concentration response is summarized using curve-fitting approaches, and then summary measure(s) are used as the phenotype in subsequent genetic association studies. The curve is usually summarized by a single parameter such as the curve’s inflection point (e.g. the EC/IC50). Such modeling makes major assumptions and has statistical limitations that should be considered. In the current review, we discuss the limitations of the EC/IC50 as a phenotype in association studies, and highlight some potential limitations with a simulation experiment. Finally, we discuss some alternative analysis approaches that have been shown to be more robust.
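For context, this is the kind of single-parameter summary being critiqued: fit a four-parameter logistic (Hill) curve and keep only the EC50, discarding the rest of the fitted curve. A minimal SciPy sketch on simulated data; the concentrations, true parameters, and noise level are hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit

def hill(conc, bottom, top, ec50, slope):
    # Four-parameter logistic (Hill) concentration-response model.
    return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** slope)

rng = np.random.default_rng(1)
conc = np.logspace(-3, 2, 8)                       # hypothetical doses (uM)
resp = hill(conc, 5.0, 95.0, 0.5, 1.2) + rng.normal(0, 3, conc.size)

params, cov = curve_fit(hill, conc, resp, p0=[0.0, 100.0, 1.0, 1.0],
                        bounds=([-20, 50, 1e-4, 0.1], [50, 150, 100.0, 5.0]))
ec50, ec50_se = params[2], np.sqrt(cov[2, 2])
print(f"EC50 = {ec50:.2f} (SE {ec50_se:.2f})")     # the lone summary phenotype
```

Reducing the four fitted parameters (and their uncertainty) to this single point estimate is the modeling assumption the review asks readers to question.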


Scientific Reports | 2017

Predictive Modeling of Physician-Patient Dynamics That Influence Sleep Medication Prescriptions and Clinical Decision-Making

Andrew L. Beam; Uri Kartoun; Jennifer K. Pai; Arnaub K. Chatterjee; Timothy Fitzgerald; Stanley Y. Shaw; Isaac S. Kohane

Insomnia remains under-diagnosed and poorly treated despite its high economic and social costs. Though previous work has examined how patient characteristics affect sleep medication prescriptions, the role of physician characteristics in this clinical decision remains unclear. We sought to understand patient and physician factors that influence sleep medication prescribing patterns by analyzing Electronic Medical Records (EMRs), including the narrative clinical notes as well as codified data. Zolpidem and trazodone were the most widely prescribed initial sleep medications in a cohort of 1,105 patients. Some providers showed a historical preference for one medication, which was highly predictive of their future prescribing behavior. Using a predictive model (AUC = 0.77), physician preference largely determined which medication a patient received (OR = 3.13; p = 3 × 10−37). In addition to the dominant effect of empirically determined physician preference, discussion of depression in a patient’s note had a statistically significant association with receiving a prescription for trazodone (OR = 1.38, p = 0.04). EMR data can yield insights into physician prescribing behavior based on real-world physician-patient interactions.
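A minimal sketch of the kind of predictive model the abstract describes, not the study's model or data: logistic regression on two illustrative features, a physician's historical trazodone share and whether depression is discussed in the note, with effect sizes loosely chosen to echo the reported odds ratios.

```python
# Hypothetical reconstruction of the prescribing model on synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 1105                                   # cohort size from the abstract
pref = rng.uniform(0, 1, n)                # physician's past trazodone share
depression = rng.binomial(1, 0.3, n)       # depression discussed in the note
logit = 3.0 * (pref - 0.5) + np.log(1.38) * depression   # assumed effects
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))        # 1 = trazodone

X = np.column_stack([pref, depression])
model = LogisticRegression().fit(X, y)
auc = roc_auc_score(y, model.predict_proba(X)[:, 1])
print(f"AUC = {auc:.2f}, odds ratios = {np.exp(model.coef_[0])}")
```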


Dose-Response | 2011

Optimization of Nonlinear Dose- and Concentration-Response Models Utilizing Evolutionary Computation

Andrew L. Beam; Alison A. Motsinger-Reif

An essential part of toxicity and chemical screening is assessing the concentration-related effects of a test article. Most often this concentration-response is nonlinear, necessitating sophisticated regression methodologies. The parameters derived from curve fitting are essential in determining a test article's potency (EC50) and efficacy (Emax), and variations in model fit may lead to different conclusions about an article's performance and safety. Previous approaches have leveraged advanced statistical and mathematical techniques to implement nonlinear least squares (NLS) for obtaining the parameters defining such a curve. These approaches, while mathematically rigorous, suffer from sensitivity to initial values and computational intensity, and rely on complex and intricate numerical techniques. However, if there is a known mathematical model that can reliably predict the data, then nonlinear regression may equally be viewed as parameter optimization. In this context, one may utilize proven techniques from machine learning, such as evolutionary algorithms, which are robust, powerful, and require far less computational machinery to optimize the defining parameters. In the current study we present a new method that uses such techniques, Evolutionary Algorithm Dose Response Modeling (EADRM), and demonstrate its effectiveness compared with more conventional methods on both real and simulated data.
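EADRM's implementation is not reproduced here, but its core idea, treating curve fitting as black-box parameter optimization with an evolutionary algorithm, can be sketched with SciPy's differential evolution (a standard evolutionary algorithm) over the Hill-model parameters. Data and bounds are hypothetical.

```python
import numpy as np
from scipy.optimize import differential_evolution

def hill(conc, bottom, top, ec50, slope):
    # Four-parameter concentration-response curve to be fit.
    return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** slope)

rng = np.random.default_rng(2)
conc = np.logspace(-2, 2, 10)                        # hypothetical doses
obs = hill(conc, 0.0, 100.0, 1.5, 1.0) + rng.normal(0, 4, conc.size)

def sse(theta):
    # Fitness: squared error between data and candidate curve.
    return np.sum((obs - hill(conc, *theta)) ** 2)

bounds = [(-10, 50), (50, 150), (1e-3, 100), (0.1, 5)]  # bottom, top, EC50, slope
result = differential_evolution(sse, bounds, seed=0)    # evolve the population
print("Estimated (bottom, top, EC50, slope):", result.x)
```

No gradients or starting values are needed, which is the robustness-to-initialization advantage the abstract highlights.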

Collaboration


Dive into Andrew L. Beam's collaborations.

Top Co-Authors


Alison A. Motsinger-Reif

North Carolina State University


Jon Doyle

North Carolina State University
