Douglas P. Hill
Dartmouth College
Publication
Featured research published by Douglas P. Hill.
Archive | 2013
Jason H. Moore; Douglas P. Hill; Arvis Sulovari; La Creis R. Kidd
Given infinite time, humans would progress through modeling complex data in a manner that is dependent on prior expert knowledge. The goal of the present study is to make extensions and enhancements to a computational evolution system (CES) that has the ultimate objective of tinkering with data as a human would. This is accomplished by providing flexibility in the model-building process and a meta-layer that learns how to generate better models. The key to CES is the ability to identify and exploit expert knowledge from biological databases or prior analytical results. Our prior results have demonstrated that CES is capable of efficiently navigating these large and rugged fitness landscapes toward the discovery of biologically meaningful genetic models of disease. Further, we have shown that the efficacy of CES is improved dramatically when the system is provided with statistical or biological expert knowledge. The goal of the present study was to apply CES to the genetic analysis of prostate cancer aggressiveness in a large sample of European Americans. We introduce here the use of Pareto-optimization to help address overfitting in the learning system. We further introduce a post-processing step that uses hierarchical cluster analysis to generate expert knowledge from the landscape of best models and their predictions across patients. We find that the combination of Pareto-optimization and post-processing of results greatly improves the genetic analysis of prostate cancer.
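The abstract above does not say which objectives the Pareto-optimization trades off. As a minimal sketch, assume two hypothetical objectives: training accuracy (higher is better) and model size (smaller is better, as a guard against overfitting). The model names and scores below are invented for illustration and are not results from the study.

```python
from typing import List, Tuple

# (name, training accuracy, model size). All values are illustrative assumptions.
Model = Tuple[str, float, int]


def dominates(a: Model, b: Model) -> bool:
    """Return True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[1] >= b[1] and a[2] <= b[2]
    strictly_better = a[1] > b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(models: List[Model]) -> List[Model]:
    """Keep only the models that no other model dominates."""
    return [m for m in models if not any(dominates(other, m) for other in models)]


models: List[Model] = [
    ("m1", 0.70, 3),
    ("m2", 0.72, 9),
    ("m3", 0.65, 2),
    ("m4", 0.69, 5),   # dominated by m1: less accurate and larger
    ("m5", 0.72, 12),  # dominated by m2: same accuracy but larger
]

front = pareto_front(models)
print([name for name, _, _ in front])
```

Selecting from the non-dominated set, instead of ranking by accuracy alone, keeps small models in contention even when a larger model fits the training data slightly better.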
Archive | 2010
Casey S. Greene; Douglas P. Hill; Jason H. Moore
The relationship between interindividual variation in our genomes and variation in our susceptibility to common diseases is expected to be complex with multiple interacting genetic factors. A central goal of human genetics is to identify which DNA sequence variations predict disease risk in human populations. Our success in this endeavour will depend critically on the development and implementation of computational intelligence methods that are able to embrace, rather than ignore, the complexity of the genotype to phenotype relationship. To this end, we have developed a computational evolution system (CES) to discover genetic models of disease susceptibility involving complex relationships between DNA sequence variations. The CES approach is hierarchically organized and is capable of evolving operators of any arbitrary complexity. The ability to evolve operators distinguishes this approach from artificial evolution approaches using fixed operators such as mutation and recombination. Our previous studies have shown that a CES that can utilize expert knowledge about the problem in evolved operators significantly outperforms a CES unable to use this knowledge. This environmental sensing of external sources of biological or statistical knowledge is important when the search space is both rugged and large as in the genetic analysis of complex diseases. We show here that the CES is also capable of evolving operators which exploit one of several sources of expert knowledge to solve the problem. This is important for both the discovery of highly fit genetic models and because the particular source of expert knowledge used by evolved operators may provide additional information about the problem itself. This study brings us a step closer to a CES that can solve complex problems in human genetics in addition to discovering genetic models of disease.
GPTP | 2014
Jason H. Moore; Douglas P. Hill; Andrew J. Saykin; Li Shen
Susceptibility to Alzheimer’s disease is likely due to complex interaction among many genetic and environmental factors. Identifying complex genetic effects in large data sets will require computational methods that extend beyond what parametric statistical methods such as logistic regression can provide. We have previously introduced a computational evolution system (CES) that uses genetic programming (GP) to represent genetic models of disease and to search for optimal models in a rugged fitness landscape that is effectively infinite in size. The CES approach differs from other GP approaches in that it is able to learn how to solve the problem by generating its own operators. A key feature is the ability for the operators to use expert knowledge to guide the stochastic search. We have previously shown that CES is able to discover nonlinear genetic models of disease susceptibility in both simulated and real data. The goal of the present study was to introduce a measure of interestingness into the modeling process. Here, we define interestingness as a measure of non-additive gene-gene interactions. That is, we are more interested in those CES models that include attributes that exhibit synergistic effects on disease risk. To implement this new feature we first pre-processed the data to measure all pairwise gene-gene interaction effects using entropy-based methods. We then provided these pre-computed measures to CES as expert knowledge and as one of three fitness criteria in three-dimensional Pareto optimization. We applied this new CES algorithm to an Alzheimer’s disease data set with approximately 520,000 genetic attributes. We show that this approach discovers more interesting models with the added benefit of improving classification accuracy. This study demonstrates the applicability of CES to genome-wide genetic analysis using expert knowledge derived from measures of interestingness.
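The abstract above refers to "entropy-based methods" for scoring pairwise gene-gene interactions without naming a formula. One common choice, assumed here for illustration only, is interaction information: the information two attributes carry jointly about disease status minus what each carries alone. The toy genotypes below are invented and use an XOR pattern, where neither attribute is informative by itself.

```python
from collections import Counter
from math import log2
from typing import Hashable, Sequence


def entropy(values: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of the empirical distribution of values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(x: Sequence[Hashable], y: Sequence[Hashable]) -> float:
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def interaction_information(a, b, c) -> float:
    """I(A,B;C) - I(A;C) - I(B;C); positive values indicate synergy."""
    joint = list(zip(a, b))
    return mutual_information(joint, c) - mutual_information(a, c) - mutual_information(b, c)


# Hypothetical binary genotypes and a status defined as their XOR.
snp_a = [0, 0, 1, 1] * 2
snp_b = [0, 1, 0, 1] * 2
status = [x ^ y for x, y in zip(snp_a, snp_b)]

print(mutual_information(snp_a, status))              # single-attribute signal
print(interaction_information(snp_a, snp_b, status))  # pairwise synergy
```

In a genome-wide setting such a score would be pre-computed for attribute pairs and then supplied to the search as expert knowledge, as the abstract describes.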
Archive | 2011
Jason H. Moore; Douglas P. Hill; Jonathan M. Fisher; Nicole A. Lavender; La Creis R. Kidd
The paradigm of identifying genetic risk factors for common human diseases by analyzing one DNA sequence variation at a time is quickly being replaced by research strategies that embrace the multivariate complexity of the genotype to phenotype mapping relationship that is likely due, in part, to nonlinear interactions among many genetic and environmental factors. Embracing the complexity of common diseases such as cancer requires powerful computational methods that are able to model nonlinear interactions in high-dimensional genetic data. Previously, we have addressed this challenge with the development of a computational evolution system (CES) that incorporates greater biological realism than traditional artificial evolution methods, such as genetic programming. Our results have demonstrated that CES is capable of efficiently navigating these large and rugged fitness landscapes toward the discovery of biologically meaningful genetic models of disease predisposition. Further, we have shown that the efficacy of CES is improved dramatically when the system is provided with statistical expert knowledge, derived from a family of machine learning techniques known as Relief, or biological expert knowledge, derived from sources such as protein-protein interaction databases. The goal of the present study was to apply CES to the genetic analysis of prostate cancer aggressiveness in a large sample of European Americans. We introduce here the use of 3D visualization methods to identify interesting patterns in CES results. Information extracted from the visualization through human-computer interaction is then provided as expert knowledge to new CES runs in a cascading framework. We present a CES-derived multivariate classifier and provide a statistical and biological interpretation in the context of prostate cancer prediction.
The incorporation of human-computer interaction into CES provides a first step towards an interactive discovery system where the experts can be embedded in the computational discovery process. Our working hypothesis is that this type of human-computer interaction will provide more useful results for complex problem solving than the traditional black box machine learning approach.
Biodata Mining | 2015
Talia L. Weiss; Amanda Zieselman; Douglas P. Hill; Solomon G. Diamond; Li Shen; Andrew J. Saykin; Jason H. Moore
Background: Biological data mining is a powerful tool that can provide a wealth of information about patterns of genetic and genomic biomarkers of health and disease. A potential disadvantage of data mining is the volume and complexity of the results, which can often be overwhelming. It is our working hypothesis that visualization methods can greatly enhance our ability to make sense of data mining results. More specifically, we propose that 3-D printing has an important role to play as a visualization technology in biological data mining. We provide here a brief review of 3-D printing along with a case study to illustrate how it might be used in a research setting. Results: We present as a case study a genetic interaction network associated with grey matter density, an endophenotype for late onset Alzheimer’s disease, as a physical model constructed with a 3-D printer. The synergy or interaction effects of multiple genetic variants were represented through a color gradient of the physical connections between nodes. The digital gene-gene interaction network was then 3-D printed to generate a physical network model. Conclusions: The physical 3-D gene-gene interaction network provided an easily manipulated, intuitive and creative way to visualize the synergistic relationships between the genetic variants and grey matter density in patients with late onset Alzheimer’s disease. We discuss the advantages and disadvantages of this novel method of biological data mining visualization.
Archive | 2011
Kristine A. Pattin; Joshua L. Payne; Douglas P. Hill; Thomas Caldwell; Jonathan M. Fisher; Jason H. Moore
The etiology of common human disease often involves a complex genetic architecture, where numerous points of genetic variation interact to influence disease susceptibility. Automating the detection of such epistatic genetic risk factors poses a major computational challenge, as the number of possible gene-gene interactions increases combinatorially with the number of sequence variations. Previously, we addressed this challenge with the development of a computational evolution system (CES) that incorporates greater biological realism than traditional artificial evolution methods. Our results demonstrated that CES is capable of efficiently navigating these large and rugged epistatic landscapes toward the discovery of biologically meaningful genetic models of disease predisposition. Further, we have shown that the efficacy of CES is improved dramatically when the system is provided with statistical expert knowledge. We anticipate that biological expert knowledge, such as genetic regulatory or protein-protein interaction maps, will provide complementary information, and further improve the ability of CES to model the genetic architectures of common human disease. The goal of this study is to test this hypothesis, utilizing publicly available protein-protein interaction information. We show that by incorporating this source of expert knowledge, the system is able to identify functional interactions that represent more concise models of disease susceptibility with improved accuracy. Our ability to incorporate biological knowledge into learning algorithms is an essential step toward the routine use of methods such as CES for identifying genetic risk factors for common human diseases.
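To make the combinatorial growth mentioned above concrete, the sketch below counts the candidate interaction sets for a few attribute counts. The counts of sequence variations are illustrative assumptions, not figures from the study.

```python
from math import comb  # available in Python 3.8+


def interaction_count(n_variants: int, order: int) -> int:
    """Number of distinct sets of `order` variants drawn from n_variants."""
    return comb(n_variants, order)


# Hypothetical numbers of sequence variations.
for n in (100, 1_000, 10_000):
    pairs = interaction_count(n, 2)
    triples = interaction_count(n, 3)
    print(f"{n:>6} variants: {pairs:>12,} pairs, {triples:>18,} triples")
```

Exhaustive evaluation becomes impractical quickly as the order of interaction rises, which is why a guided stochastic search that uses expert knowledge is attractive.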
Archive | 2015
Jason H. Moore; Casey S. Greene; Douglas P. Hill
The genetic basis for primary open-angle glaucoma (POAG) is not yet understood but is likely the result of many interacting genetic variants that influence risk in the context of our local ecology. The complexity of the genotype to phenotype mapping relationship for common diseases like POAG necessitates analytical approaches that move beyond parametric statistical methods such as logistic regression that assume a particular mathematical model. This is particularly important in the era of big data where it is routine to collect and analyze data sets with hundreds of thousands of measured genetic variants in thousands of human subjects. We introduce here the Exploratory Modeling for Extracting Relationships using Genetic and Evolutionary Navigation Techniques (EMERGENT) algorithm as an artificial intelligence approach to the genetic analysis of common human diseases. EMERGENT builds models of genetic variation from lists of mathematical functions using a form of genetic programming called computational evolution. A key feature of the system is the ability to utilize pre-processed expert knowledge giving it the ability to explore model space much as a human would. We describe this system in detail and then apply it to the genetic analysis of POAG in the Glaucoma Gene Environment Initiative (GLAUGEN) study that included approximately 1,272 subjects with the disease and 1,057 healthy controls. A total of 657,366 single-nucleotide polymorphisms (SNPs) from across the human genome were measured in these subjects and available for analysis. Analysis using the EMERGENT framework revealed a best model consisting of six SNPs that map to at least six different genes. Two of these genes have previously been associated with POAG in several studies. The others represent new hypotheses about the genetic basis of POAG. All of the SNPs are involved in non-additive gene-gene interactions.
Further, the six genes are all directly or indirectly related through biological interactions to the vascular endothelial growth factor (VEGF) gene that is an actively investigated drug target for POAG. This study demonstrates the routine application of an artificial intelligence-based system for the genetic analysis of complex human diseases.
Journal of Bone and Mineral Research | 2018
Jennifer C. Kelley; Nicolas Stettler‐Davis; Mary B. Leonard; Douglas P. Hill; Justine Shults; Virginia A. Stallings; Berkowitz Rj; Melissa S. Xanthopoulos; Elizabeth Prout‐Parks; Sarah B. Klieger; Babette S. Zemel
Obese adolescents have increased fracture risk, but effects of alterations in adiposity on bone accrual and strength in obese adolescents are not understood. We evaluated 12‐month changes in trabecular and cortical volumetric bone mineral density (vBMD) and cortical geometry in obese adolescents undergoing a randomized weight management program, and investigated the effect of body composition changes on bone outcomes. Peripheral quantitative computed tomography (pQCT) of the radius and tibia, and whole‐body dual‐energy X‐ray absorptiometry (DXA) scans were obtained at baseline, 6 months, and 12 months in 91 obese adolescents randomized to standard care versus behavioral intervention for weight loss. Longitudinal models assessed effects of body composition changes on bone outcomes, adjusted for age, bone length, and African‐American ancestry, and stratified by sex. Secondary analyses included adjustment for physical activity, maturation, vitamin D, and inflammatory biomarkers. Baseline body mass index (BMI) was similar between intervention groups. Twelve‐month change in BMI in the standard care group was 1.0 kg/m² versus –0.4 kg/m² in the behavioral intervention group (p < 0.01). Intervention groups were similar in bone outcomes, so they were combined for subsequent analyses. For the tibia, BMI change was not associated with change in vBMD or structure. Greater baseline lean body mass index (LBMI) was associated with higher cortical vBMD in males, trabecular vBMD in females, and polar section modulus (pZ) and periosteal circumference (Peri‐C) in both sexes. In females, change in LBMI was positively associated with gains in pZ and Peri‐C. Baseline visceral adipose tissue (VFAT) was inversely associated with pZ in males and cortical vBMD in females. Change in VFAT did not affect bone outcomes. For the radius, BMI and LBMI changes were positively associated with pZ in males.
Thus, in obese adolescents, weight loss intervention with modest changes in BMI was not detrimental to radius or tibia bone strength, and changes in lean, but not adiposity, measures were beneficial to bone development.
Pacific Symposium on Biocomputing | 2011
Jason H. Moore; Richard Cowper-Sallari; Douglas P. Hill; Patricia L. Hibberd; Juliette C. Madan
Genetic and Evolutionary Computation Conference | 2009
Casey S. Greene; Douglas P. Hill; Jason H. Moore