Shyam Visweswaran
University of Pittsburgh
Publications
Featured research published by Shyam Visweswaran.
BMC Bioinformatics | 2011
Xia Jiang; Richard E. Neapolitan; M. Michael Barmada; Shyam Visweswaran
Background: Gene-gene epistatic interactions likely play an important role in the genetic basis of many common diseases. Recently, machine-learning and data mining methods have been developed for learning epistatic relationships from data. A well-known combinatorial method that has been successfully applied for detecting epistasis is Multifactor Dimensionality Reduction (MDR). Jiang et al. created a combinatorial epistasis learning method called BNMBL to learn Bayesian network (BN) epistatic models. They compared BNMBL to MDR using simulated data sets. Each of these data sets was generated from a model that associates two SNPs with a disease and includes 18 unrelated SNPs. For each data set, BNMBL and MDR were used to score all 2-SNP models, and BNMBL learned significantly more correct models. In real data sets, we ordinarily do not know the number of SNPs that influence phenotype. BNMBL may not perform as well if we also scored models containing more than two SNPs. Furthermore, a number of other BN scoring criteria have been developed. They may detect epistatic interactions even better than BNMBL. Although BNs are a promising tool for learning epistatic relationships from data, we cannot confidently use them in this domain until we determine which scoring criteria work best or even well when we try learning the correct model without knowledge of the number of SNPs in that model.
Results: We evaluated the performance of 22 BN scoring criteria using 28,000 simulated data sets and a real Alzheimer's GWAS data set. Our results were surprising in that the Bayesian scoring criterion with large values of a hyperparameter called α performed best. This score performed better than other BN scoring criteria and MDR at recall using simulated data sets, at detecting the hardest-to-detect models using simulated data sets, and at substantiating previous results using the real Alzheimer's data set.
Conclusions: We conclude that representing epistatic interactions using BN models and scoring them using a BN scoring criterion holds promise for identifying epistatic genetic variants in data. In particular, the Bayesian scoring criterion with large values of a hyperparameter α appears more promising than a number of alternatives.
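To make the role of α concrete, the following is a minimal sketch of a Bayesian Dirichlet (BDeu-style) log score for a disease node given candidate SNP parents. The function name, the toy data, and the choice of the BDeu prior form are assumptions made for illustration; the 22 scoring criteria evaluated in the paper are not reproduced here.

```python
from collections import Counter
from math import lgamma


def bdeu_log_score(parent_rows, child, alpha, child_states=2, parent_states=3):
    """Log marginal likelihood of `child` given its parents under a BDeu prior.

    parent_rows: one tuple of parent values per sample (e.g. SNP genotypes)
    child: one child value per sample, coded 0..child_states-1
    alpha: equivalent sample size hyperparameter
    """
    n_parents = len(parent_rows[0]) if parent_rows else 0
    q = parent_states ** n_parents   # number of parent configurations
    r = child_states                 # number of child states
    a_j = alpha / q                  # prior mass per parent configuration
    a_jk = alpha / (q * r)           # prior mass per (configuration, state) cell
    n_j = Counter(parent_rows)
    n_jk = Counter(zip(parent_rows, child))
    score = 0.0
    # Unobserved parent configurations contribute exactly zero, so we only
    # iterate over the configurations that occur in the data.
    for config, count in n_j.items():
        score += lgamma(a_j) - lgamma(a_j + count)
        for k in range(r):
            score += lgamma(a_jk + n_jk[(config, k)]) - lgamma(a_jk)
    return score


# Toy XOR-like interaction between two binary markers (hypothetical data).
snps = [(0, 0), (0, 1), (1, 0), (1, 1)] * 10
disease = [0, 1, 1, 0] * 10
with_parents = bdeu_log_score(snps, disease, alpha=9, parent_states=2)
no_parents = bdeu_log_score([()] * 40, disease, alpha=9)
```

On this toy data the two-marker model scores higher than the empty model, which is the kind of comparison a model search would repeat over many candidate SNP sets.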
Journal of Biomedical Informatics | 2013
Milos Hauskrecht; Iyad Batal; Michal Valko; Shyam Visweswaran; Gregory F. Cooper; Gilles Clermont
We develop and evaluate a data-driven approach for detecting unusual (anomalous) patient-management decisions using past patient cases stored in electronic health records (EHRs). Our hypothesis is that a patient-management decision that is unusual with respect to past patient care may be due to an error and that it is worthwhile to generate an alert if such a decision is encountered. We evaluate this hypothesis using data obtained from EHRs of 4486 post-cardiac surgical patients and a subset of 222 alerts generated from the data. We base the evaluation on the opinions of a panel of experts. The results of the study support our hypothesis that the outlier-based alerting can lead to promising true alert rates. We observed true alert rates that ranged from 25% to 66% for a variety of patient-management actions, with 66% corresponding to the strongest outliers.
Genetic Epidemiology | 2010
Xia Jiang; M. Michael Barmada; Shyam Visweswaran
It is believed that interactions among genes (epistasis) may play an important role in susceptibility to common diseases (Moore and Williams [2002]. Ann Med 34:88–95; Ritchie et al. [2001]. Am J Hum Genet 69:138–147). To study the underlying genetic variants of diseases, genome‐wide association studies (GWAS) that simultaneously assay several hundreds of thousands of SNPs are being increasingly used. Often, the data from these studies are analyzed with single‐locus methods (Lambert et al. [2009]. Nat Genet 41:1094–1099; Reiman et al. [2007]. Neuron 54:713–720). However, epistatic interactions may not be easily detected with single‐locus methods (Marchini et al. [2005]. Nat Genet 37:413–417). As a result, both parametric and nonparametric multi‐locus methods have been developed to detect such interactions (Heidema et al. [2006]. BMC Genet 7:23). However, efficiently analyzing epistasis using high‐dimensional genome‐wide data remains a crucial challenge. We develop a method based on Bayesian networks and the minimum description length principle for detecting epistatic interactions. We compare its ability to detect gene‐gene interactions and its efficiency to that of the combinatorial method multifactor dimensionality reduction (MDR) using 28,000 simulated data sets generated from 70 different genetic models. We further apply the method to over 300,000 SNPs obtained from a GWAS involving late onset Alzheimer's disease (LOAD). Our method outperforms MDR and we substantiate previous results indicating that the GAB2 gene is associated with LOAD. To our knowledge, this is the first successful model‐based epistatic analysis using a high‐dimensional genome‐wide data set. Genet. Epidemiol. 34:575–581, 2010.
Journal of the American Medical Informatics Association | 2011
Wei Wei; Shyam Visweswaran; Gregory F. Cooper
OBJECTIVE Predicting patient outcomes from genome-wide measurements holds significant promise for improving clinical care. The large number of measurements (eg, single nucleotide polymorphisms (SNPs)), however, makes this task computationally challenging. This paper evaluates the performance of an algorithm that predicts patient outcomes from genome-wide data by efficiently model averaging over an exponential number of naive Bayes (NB) models. DESIGN This model-averaged naive Bayes (MANB) method was applied to predict late onset Alzheimer's disease in 1411 individuals who each had 312,318 SNP measurements available as genome-wide predictive features. Its performance was compared to that of a naive Bayes algorithm without feature selection (NB) and with feature selection (FSNB). MEASUREMENT Performance of each algorithm was measured in terms of area under the ROC curve (AUC), calibration, and run time. RESULTS The training time of MANB (16.1 s) was fast like NB (15.6 s), while FSNB (1684.2 s) was considerably slower. Each of the three algorithms required less than 0.1 s to predict the outcome of a test case. MANB had an AUC of 0.72, which is significantly better than the AUC of 0.59 by NB (p<0.00001), but not significantly different from the AUC of 0.71 by FSNB. MANB was better calibrated than NB, and FSNB was even better in calibration. A limitation was that only one dataset and two comparison algorithms were included in this study. CONCLUSION MANB performed comparatively well in predicting a clinical outcome from a high-dimensional genome-wide dataset. These results provide support for including MANB in the methods used to predict outcomes from large, genome-wide datasets.
International Journal of Medical Informatics | 2011
Sandra L. Kane-Gill; Shyam Visweswaran; Melissa I. Saul; An-Kwok Ian Wong; Louis E. Penrod; Steven M. Handler
OBJECTIVE Clinical event monitors are a type of active medication monitoring system that can use signals to alert clinicians to possible adverse drug reactions (ADRs). The primary goal was to evaluate the positive predictive values of select signals used to automate the detection of ADRs in the medical intensive care unit. METHOD This is a prospective case series of adult patients in the medical intensive care unit during a six-week period who had one of five signals present: an elevated blood urea nitrogen, vancomycin, or quinidine concentration, or a low sodium or glucose concentration. Alerts were assessed using 3 objective published adverse drug reaction determination instruments. An event was considered an adverse drug reaction when 2 out of 3 instruments had agreement of possible, probable or definite. Positive predictive values were calculated as the number of alerts for which an adverse drug reaction was confirmed, divided by the total number of alerts. RESULTS 145 patients were eligible for evaluation. For the 48 patients (50% male) having an alert, the mean±SD age was 62±19 years. A total of 253 alerts were generated. Positive predictive values were 1.0, 0.55, 0.38 and 0.33 for vancomycin, glucose, sodium, and blood urea nitrogen, respectively. A quinidine alert was not generated during the evaluation. CONCLUSIONS Computerized clinical event monitoring systems should be considered when developing methods to detect adverse drug reactions as part of intensive care unit patient safety surveillance systems, since they can automate the detection of these events using signals that have good performance characteristics by processing commonly available laboratory and medication information.
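Positive predictive value is conventionally the fraction of fired alerts that turn out to be confirmed events. A minimal sketch follows; the function name and the counts are hypothetical, since the abstract does not report per-signal alert counts.

```python
def positive_predictive_value(confirmed_alerts, total_alerts):
    """Fraction of alerts confirmed as adverse drug reactions.

    Returns None when no alerts fired, because the ratio is undefined
    (as for a signal that never triggered during the study period).
    """
    if total_alerts == 0:
        return None
    return confirmed_alerts / total_alerts


# Hypothetical counts, for illustration only.
glucose_ppv = positive_predictive_value(confirmed_alerts=11, total_alerts=20)
no_alert_ppv = positive_predictive_value(confirmed_alerts=0, total_alerts=0)
```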
BMC Bioinformatics | 2011
Jonathan L. Lustgarten; Shyam Visweswaran; Vanathi Gopalakrishnan; Gregory F. Cooper
Background: Several data mining methods require data that are discrete, and other methods often perform better with discrete data. We introduce an efficient Bayesian discretization (EBD) method for optimal discretization of variables that runs efficiently on high-dimensional biomedical datasets. The EBD method consists of two components, namely, a Bayesian score to evaluate discretizations and a dynamic programming search procedure to efficiently search the space of possible discretizations. We compared the performance of EBD to Fayyad and Irani's (FI) discretization method, which is commonly used for discretization.
Results: On 24 biomedical datasets obtained from high-throughput transcriptomic and proteomic studies, the classification performances of the C4.5 classifier and the naïve Bayes classifier were statistically significantly better when the predictor variables were discretized using EBD over FI. EBD was statistically significantly more stable to the variability of the datasets than FI. However, EBD was less robust, though not statistically significantly so, than FI and produced slightly more complex discretizations than FI.
Conclusions: On a range of biomedical datasets, a Bayesian discretization method (EBD) yielded better classification performance and stability but was less robust than the widely used FI discretization method. The EBD discretization method is easy to implement, permits the incorporation of prior knowledge and belief, and is sufficiently fast for application to high-dimensional data.
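The two components named in the abstract, a score for each discretization and a dynamic-programming search over cut points, can be illustrated with a simplified sketch. This is not the EBD score itself: the per-interval Beta(1,1) marginal likelihood, the fixed per-cut penalty, and all function names are assumptions chosen to keep the example short.

```python
from math import lgamma, log


def interval_log_score(labels):
    """Log marginal likelihood of binary labels in one interval,
    assuming a uniform Beta(1, 1) prior on the class probability."""
    n1 = sum(labels)
    n0 = len(labels) - n1
    return lgamma(n1 + 1) + lgamma(n0 + 1) - lgamma(n0 + n1 + 2)


def best_discretization(values, labels, cut_penalty=log(2)):
    """Return the cut points maximizing the summed interval scores
    minus a penalty per cut (a stand-in for a structure prior)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    xs = [values[i] for i in order]
    ys = [labels[i] for i in order]
    n = len(xs)
    best = [0.0] + [float("-inf")] * n  # best[j]: best score of first j points
    back = [0] * (n + 1)                # back[j]: start index of last interval
    for j in range(1, n + 1):
        for i in range(j):
            # A cut before position i needs two distinct neighboring values.
            if i > 0 and xs[i - 1] == xs[i]:
                continue
            penalty = cut_penalty if i > 0 else 0.0
            score = best[i] + interval_log_score(ys[i:j]) - penalty
            if score > best[j]:
                best[j], back[j] = score, i
    cuts = []
    j = n
    while j > 0:  # walk the back-pointers to recover the chosen cuts
        i = back[j]
        if i > 0:
            cuts.append((xs[i - 1] + xs[i]) / 2)
        j = i
    return sorted(cuts)


# Toy variable whose low values are all class 0 and high values all class 1.
cuts = best_discretization([1, 2, 3, 4, 10, 11, 12, 13], [0, 0, 0, 0, 1, 1, 1, 1])
```

Because the total score decomposes into a sum over intervals, the best partition of a prefix can be reused when extending it, which is what makes the search exact without enumerating every combination of cut points. Rescoring each interval from scratch, as done here for clarity, is slower than keeping running counts.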
Journal of Biomedical Informatics | 2010
Shyam Visweswaran; Derek C. Angus; Margaret Hsieh; Lisa A. Weissfeld; Donald M. Yealy; Gregory F. Cooper
We introduce an algorithm for learning patient-specific models from clinical data to predict outcomes. Patient-specific models are influenced by the particular history, symptoms, laboratory results, and other features of the patient case at hand, in contrast to the commonly used population-wide models that are constructed to perform well on average on all future cases. The patient-specific algorithm uses Markov blanket (MB) models, carries out Bayesian model averaging over a set of models to predict the outcome for the patient case at hand, and employs a patient-specific heuristic to locate a set of suitable models to average over. We evaluate the utility of using a local structure representation for the conditional probability distributions in the MB models that captures additional independence relations among the variables compared to the typically used representation that captures only the global structure among the variables. In addition, we compare the performance of Bayesian model averaging to that of model selection. The patient-specific algorithm and its variants were evaluated on two clinical datasets for two outcomes. Our results provide support that the performance of an algorithm for learning patient-specific models can be improved by using a local structure representation for MB models and by performing Bayesian model averaging.
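The difference between Bayesian model averaging and model selection can be sketched as follows. The per-model predictions and log scores are hypothetical inputs; how the patient-specific search finds and scores Markov blanket models is not shown and is not implied by this example.

```python
from math import exp


def model_average(predictions, log_scores):
    """Average per-model predictions, weighting each model by its
    normalized posterior score.

    predictions: P(outcome = 1 | patient case, model), one per model
    log_scores: unnormalized log posterior of each model given the data
    """
    top = max(log_scores)  # subtract the max for numerical stability
    weights = [exp(s - top) for s in log_scores]
    total = sum(weights)
    return sum(p * w for p, w in zip(predictions, weights)) / total


def model_select(predictions, log_scores):
    """Use only the single highest-scoring model."""
    best = max(range(len(log_scores)), key=log_scores.__getitem__)
    return predictions[best]


# Hypothetical models that disagree about one patient case.
preds = [0.20, 0.80, 0.60]
scores = [-100.0, -100.5, -103.0]
averaged = model_average(preds, scores)
selected = model_select(preds, scores)
```

When several models have similar scores but different predictions, the averaged prediction reflects that uncertainty, while selection commits to one model.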
Journal of the American Medical Informatics Association | 2016
Jessica D. Tenenbaum; Paul Avillach; Marge M. Benham-Hutchins; Matthew K. Breitenstein; Erin L. Crowgey; Mark A. Hoffman; Xia Jiang; Subha Madhavan; John E. Mattison; Radhakrishnan Nagarajan; Bisakha Ray; Dmitriy Shin; Shyam Visweswaran; Zhongming Zhao; Robert R. Freimuth
The recent announcement of the Precision Medicine Initiative by President Obama has brought precision medicine (PM) to the forefront for healthcare providers, researchers, regulators, innovators, and funders alike. As technologies continue to evolve and datasets grow in magnitude, a strong computational infrastructure will be essential to realize PM’s vision of improved healthcare derived from personal data. In addition, informatics research and innovation affords a tremendous opportunity to drive the science underlying PM. The informatics community must lead the development of technologies and methodologies that will increase the discovery and application of biomedical knowledge through close collaboration between researchers, clinicians, and patients. This perspective highlights seven key areas that are in need of further informatics research and innovation to support the realization of PM.
Cancer | 2014
Ali H. Zaidi; Vanathi Gopalakrishnan; Pashtoon Murtaza Kasi; Xuemei Zeng; Usha Malhotra; Jeya Balaji Balasubramanian; Shyam Visweswaran; Mai Sun; Melanie S. Flint; Jon M. Davison; Brian L. Hood; Thomas P. Conrads; Jacques J. Bergman; William L. Bigbee; Blair A. Jobe
Esophageal adenocarcinoma (EAC) is associated with a dismal prognosis. The identification of cancer biomarkers can advance the possibility for early detection and better monitoring of tumor progression and/or response to therapy. The authors present results from the development of a serum‐based, 4‐protein (biglycan, myeloperoxidase, annexin‐A6, and protein S100‐A9) biomarker panel for EAC.
Journal of Biomedical Informatics | 2012
Danielle L. Mowery; Janyce Wiebe; Shyam Visweswaran; Henk Harkema; Wendy W. Chapman
Information extraction applications that extract structured event and entity information from unstructured text can leverage knowledge of clinical report structure to improve performance. The Subjective, Objective, Assessment, Plan (SOAP) framework, used to structure progress notes to facilitate problem-specific, clinical decision making by physicians, is one example of a well-known, canonical structure in the medical domain. Although its applicability to structuring data is understood, its contribution to information extraction tasks has not yet been determined. The first step to evaluating the SOAP framework's usefulness for clinical information extraction is to apply the model to clinical narratives and develop an automated SOAP classifier that classifies sentences from clinical reports. In this quantitative study, we applied the SOAP framework to sentences from emergency department reports, and trained and evaluated SOAP classifiers built with various linguistic features. We found the SOAP framework can be applied manually to emergency department reports with high agreement (Cohen's kappa coefficients over 0.70). Using a variety of features, we found classifiers for each SOAP class can be created with moderate to outstanding performance with F1 scores of 93.9 (subjective), 94.5 (objective), 75.7 (assessment), and 77.0 (plan). We look forward to expanding the framework and applying the SOAP classification to clinical information extraction tasks.
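The two evaluation measures mentioned in the abstract, Cohen's kappa for annotator agreement and the per-class F1 score for classifier performance, can be computed as in this minimal sketch. The label sequences are hypothetical, and the functions return values on a 0 to 1 scale, whereas the abstract reports F1 as percentages.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators.
    Undefined (division by zero) if expected agreement is exactly 1."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


def f1_score(gold, predicted, positive):
    """Harmonic mean of precision and recall for one class."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentence labels: S, O, A, P for the four SOAP classes.
annotator_1 = ["S", "O", "A", "P"]
annotator_2 = ["S", "O", "A", "P"]
kappa = cohens_kappa(annotator_1, annotator_2)

gold = ["S", "S", "O", "O"]
predicted = ["S", "O", "O", "O"]
f1_subjective = f1_score(gold, predicted, positive="S")
```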
