Publication


Featured research published by György J. Simon.


PLOS ONE | 2013

Multi-Platform Analysis of MicroRNA Expression Measurements in RNA from Fresh Frozen and FFPE Tissues

Christopher P. Kolbert; Rod M. Feddersen; Fariborz Rakhshan; Diane E. Grill; György J. Simon; Sumit Middha; Jin Sung Jang; Vernadette Simon; Debra A. Schultz; Michael A. Zschunke; Wilma L. Lingle; Jennifer M. Carr; E. Aubrey Thompson; Ann L. Oberg; Bruce W. Eckloff; Eric D. Wieben; Peter W. Li; Ping Yang; Jin Jen

MicroRNAs play a role in regulating diverse biological processes and have considerable utility as molecular markers for diagnosis and monitoring of human disease. Several technologies are available commercially for measuring microRNA expression. However, results from different platforms do not necessarily correlate well, making it difficult to determine which platform most closely represents the true microRNA expression level in a tissue. To address this issue, we analyzed RNA derived from cell lines, as well as fresh frozen and formalin-fixed paraffin-embedded (FFPE) tissues, using Affymetrix, Agilent, and Illumina microRNA arrays, NanoString counting, and Illumina Next Generation Sequencing. We compared performance within and between the different platforms, and then verified these results against quantitative PCR data. Our results demonstrate that the within-platform reproducibility for each method is consistently high and that, although the gene expression profiles from each platform show unique traits, detection of microRNA transcripts was similar across multiple platforms for the genes that were commonly detectable.


knowledge discovery and data mining | 2011

A simple statistical model and association rule filtering for classification

György J. Simon; Vipin Kumar; Peter W. Li

Associative classification is a predictive modeling technique that constructs a classifier based on class association rules (also known as predictive association rules; PARs). PARs are association rules where the consequence of the rule is a class label. Associative classification has gained substantial research attention because it successfully joins the benefits of association rule mining with classification. These benefits include the inherent ability of association rule mining to extract high-order interactions among the predictors--an ability that many modern classifiers lack--and also the natural interpretability of the individual PARs. Associative classification is not without its caveats. Association rule mining often discovers a combinatorially large number of association rules, eroding the interpretability of the rule set. Extensive effort has been directed towards developing interestingness measures, which filter (predictive) association rules after they have been generated. These interestingness measures, albeit very successful at selecting interesting rules, lack two features that are highly valuable in the context of classification. First, only a few of the interestingness measures are rooted in a statistical model. Given the distinction between a training and a test data set in the classification setting, the ability to make statistical inferences about the performance of the predictive classification rules on the test set is highly desirable. Second, the unfiltered set of predictive association rules (PARs) is often redundant: we can prove that certain PARs will not be used to construct a classification model given the presence of other PARs. In this paper, we propose a simple statistical model for making inferences on the test set about the various performance metrics of predictive association rules. 
We also derive three filtering criteria based on hypothesis testing, which are very selective (reducing the number of PARs to be considered by the classifier by several orders of magnitude), yet do not adversely affect the performance of the classification. In the case where the classification model is constructed as a logistic model on top of the PARs, we can mathematically prove that the filtering criteria do not significantly affect the classifier's performance. We also demonstrate empirically on three publicly available data sets that the vast reduction in the number of PARs did not come at the cost of reduced predictive performance.
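To make the idea of filtering rules by hypothesis testing concrete, here is a minimal sketch. It computes the coverage and hit count of a predictive association rule and keeps the rule only if a one-sided binomial test suggests its confidence exceeds the class prior. The specific test, the significance level, and the names `rule_stats`, `binomial_tail`, and `keep_rule` are illustrative assumptions; they are not the three filtering criteria derived in the paper.

```python
from math import comb

def rule_stats(transactions, labels, antecedent, target):
    """Return (n_covered, n_hits) for the rule: antecedent -> target.

    transactions: list of sets of items; labels: list of class labels.
    """
    covered = [y for x, y in zip(transactions, labels) if antecedent <= x]
    hits = sum(1 for y in covered if y == target)
    return len(covered), hits

def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def keep_rule(transactions, labels, antecedent, target, alpha=0.05):
    """Assumed criterion: keep a rule whose confidence is significantly
    above the class prior (one-sided binomial test at level alpha)."""
    n, k = rule_stats(transactions, labels, antecedent, target)
    if n == 0:
        return False  # the rule covers nothing, so there is no evidence
    prior = sum(1 for y in labels if y == target) / len(labels)
    return binomial_tail(n, k, prior) < alpha

# Toy data (hypothetical): items "a" and "b" co-occur with class 1.
transactions = [{"a", "b"}] * 5 + [{"c"}] * 5
labels = [1] * 5 + [0] * 5
kept = keep_rule(transactions, labels, {"a", "b"}, 1)
dropped = keep_rule(transactions, labels, {"c"}, 1)
```

A real pipeline would also need to correct for the very large number of rules tested, which this sketch ignores.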


Studies in health technology and informatics | 2015

Automated Detection of Postoperative Surgical Site Infections Using Supervised Methods with Electronic Health Record Data

Zhen Hu; György J. Simon; Elliot G. Arsoniadis; Yan Wang; Mary R. Kwaan; Genevieve B. Melton

The National Surgical Quality Improvement Program (NSQIP) is widely recognized as “the best in the nation” surgical quality improvement resource in the United States. In particular, it rigorously defines postoperative morbidity outcomes, including surgical adverse events occurring within 30 days of surgery. Thanks to its manual construction process, the NSQIP registry is of exceptionally high quality, but the high cost of that process remains a significant bottleneck to NSQIP’s wider dissemination. In this work, we propose an automated surgical adverse event detection tool, aimed at accelerating the process of extracting postoperative outcomes from medical charts. As a prototype system, we combined local EHR data with the NSQIP gold standard outcomes and developed machine-learned models to retrospectively detect Surgical Site Infections (SSI), a particular family of adverse events that NSQIP extracts. The resulting models have high specificity (from 0.788 to 0.988) as well as very high negative predictive values (>0.98), reliably eliminating the vast majority of patients without SSI and thereby significantly reducing the NSQIP extractors’ burden.
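For readers less familiar with the two metrics reported above, the following sketch shows how specificity and negative predictive value are computed from binary predictions. The function names and the toy labels are assumptions for illustration only; no data from the study is reproduced.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 = SSI present."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def specificity(tn, fp):
    """Fraction of truly negative patients that the model labels negative."""
    return tn / (tn + fp)

def negative_predictive_value(tn, fn):
    """Fraction of negative predictions that are truly negative."""
    return tn / (tn + fn)

# Hypothetical labels, not study data.
y_true = [1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
spec = specificity(tn, fp)
npv = negative_predictive_value(tn, fn)
```

A high negative predictive value is what allows chart reviewers to skip patients the model labels negative with little risk of missing an infection.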


international conference on data mining | 2015

Forensic Style Analysis with Survival Trajectories

Pranjul Yadav; Michael Steinbach; Lisiane Pruinelli; Bonnie L. Westra; Connie Delaney; Vipin Kumar; György J. Simon

Electronic Health Records (EHRs) consist of patient information such as demographics, medications, laboratory test results, diagnosis codes and procedures. Mining EHRs could lead to improvement in patient healthcare management, as EHRs contain detailed information related to disease prognosis for large patient populations. We hypothesize that a patient's condition does not deteriorate at random; rather, the trajectories (the sequences in which diseases appear in a patient) are determined by a finite number of underlying disease mechanisms. In this work, we exploit this idea by predicting a patient's risk of mortality in the context of the metabolic syndrome by assessing which of the many available trajectories a patient is following and how far the patient has progressed along it. Implementing this idea required innovative enhancements both to the study design and to the fitting algorithm. We propose a forensic-style study design, which aligns patients on last follow-up and measures time backwards. We modify the time-dependent covariate Cox proportional hazards model to better capture coefficients of covariates that follow a particular temporal sequence, such as trajectories. Knowledge extracted from such analysis can lead to personalized treatments, thereby forming the basis for future trajectory-centered guidelines.
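The forensic-style alignment can be sketched in a few lines: each patient's events are re-expressed as time before that patient's last follow-up, so that all patients share a common end point. The function name, the day-based time unit, and the example diagnoses are assumptions for illustration; the modified Cox model itself is not shown.

```python
def align_on_last_followup(events, last_followup):
    """Re-express event times as time before last follow-up.

    events: list of (time, code) pairs on the patient's own clock.
    Returns (time_before_last_followup, code) pairs, most recent first.
    Events recorded after the last follow-up are dropped.
    """
    return sorted(
        (last_followup - t, code) for t, code in events if t <= last_followup
    )

# Hypothetical patient: days since first record, illustrative diagnoses.
events = [(0, "obesity"), (400, "hypertension"), (900, "diabetes")]
aligned = align_on_last_followup(events, last_followup=1000)
```

Measuring time backwards in this way lets patients with records of very different lengths be compared at the same distance from the outcome.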


siam international conference on data mining | 2014

Mining interpretable and predictive diagnosis codes from multi-source electronic health records

Sanjoy Dey; György J. Simon; Bonnie L. Westra; Michael Steinbach; Vipin Kumar

Mining patterns from electronic health-care records (EHR) can potentially lead to better and more cost-effective treatments. We aim to find groups of ICD-9 diagnosis codes from EHRs that can predict improvement in urinary incontinence among home health care (HHC) patients and that are also interpretable to domain experts. In this paper, we propose two approaches for increasing the interpretability of the obtained groups of ICD-9 codes. First, we incorporate prior information available from clinical domain knowledge using the clinical classification system (CCS). Second, we incorporate additional types of clinical information for the same patients, such as demographic, behavioral, physiological, and psycho-social variables available from survey questions during the hospital visits. Finally, we develop a hybrid framework that can combine both the prior information and the data-driven clinical information in the predictive model framework. Our results obtained from a large-scale EHR data set show that the hybrid framework enhances clinical interpretability as compared to the baseline model obtained from ICD-9 codes only, while achieving almost the same predictive


Nursing Research | 2015

Mining Patterns Associated With Mobility Outcomes in Home Healthcare.

Sanjoy Dey; Jacob Cooner; Connie Delaney; Joanna Fakhoury; Vipin Kumar; György J. Simon; Michael Steinbach; Jeremy Weed; Bonnie L. Westra

Background: Mobility is critical for self-management. Understanding factors associated with improvement in mobility during home healthcare can help nurses tailor interventions to improve mobility outcomes and keep patients safely at home. Objectives: The aims were to (a) identify patient and support system factors associated with mobility improvement during home care, (b) evaluate consistency of factors across groups defined by mobility status at the start of home care, and (c) identify patterns of factors associated with improvement and no improvement in mobility within each group. Methods: Outcome and Assessment Information Set data extracted from a national convenience sample of 270,634 patient records collected from October 1, 2008 to December 31, 2009 from 581 Medicare-certified, home healthcare agencies were used. Patients were placed into groups based on mobility scores at admission. Odds ratios were used to index associations of factors with improvement at discharge. Discriminative pattern mining was used to discover patterns associated with improvement of mobility. Results: Overall, mobility improved for 49.4% of patients; improvement occurred most frequently (80%) among patients who were able, at admission, to walk only with the supervision or assistance of another person at all times. Numerous factors associated with improvement in mobility outcome were similar across the groups (except for those who were chairfast but were able to wheel themselves independently); however, the number, strength, and direction of associations varied. In most groups, data mining-discovered patterns of factors associated with the mobility outcome were composed of combinations of functional and cognitive status and the type and amount of help required at home. Discussion: This study provides new data mining-based information about how factors associated with improvement in mobility group together and vary by mobility at admission. 
These approaches have potential to provide new insights for clinicians to tailor interventions for improvement of mobility.
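Since the abstract uses odds ratios to index associations, a small sketch of that calculation may help. It computes the odds ratio from a 2x2 table, with a 95% confidence interval from the standard log (Woolf) method. The function name and the counts are hypothetical and are not taken from the study.

```python
from math import exp, log, sqrt

def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI for a 2x2 table.

    a: factor present, improved      b: factor present, not improved
    c: factor absent, improved       d: factor absent, not improved
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this method")
    estimate = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lower = exp(log(estimate) - z * se_log)
    upper = exp(log(estimate) + z * se_log)
    return estimate, lower, upper

# Hypothetical counts for one factor within one admission-mobility group.
or_est, or_lo, or_hi = odds_ratio(30, 10, 20, 40)
```

An odds ratio above 1 with an interval that excludes 1 would indicate that the factor is associated with improvement in that group.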


Progress in Transplantation | 2017

Predictors of liver transplant patient survival: A critical review using a holistic framework

Lisiane Pruinelli; Karen A. Monsen; Cynthia R. Gross; David M. Radosevich; György J. Simon; Bonnie L. Westra

Objective: Liver transplantation is a costly and risky procedure, with 25 050 procedures performed worldwide in 2013 and 6729 performed in the United States in 2014. Despite the scarcity of organs and uncertainty regarding prognosis, few studies address the variety of pretransplant risk factors that might contribute to predicting patient survival and thereby to developing better models that take a holistic view of transplant patients. This critical review aimed to identify predictors of liver transplant patient survival included in large-scale studies and assess the gap in risk factors from a holistic approach using the Wellbeing Model and the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement. Data Source: Search of the Cumulative Index to Nursing and Allied Health Literature (CINAHL), Medline, and PubMed from the 1980s to July 2014. Study Selection: Original longitudinal large-scale studies, of 500 or more subjects, published in English, Spanish, or Portuguese, which described predictors of patient survival after deceased donor liver transplantation. Data Extraction: Predictors were extracted from 26 studies that met the inclusion criteria. Data Synthesis: Each article was reviewed and predictors were categorized using a holistic framework, the Wellbeing Model (health, community, environment, relationship, purpose, and security dimensions). Conclusions: The majority (69.7%) of the predictors represented the Wellbeing Model Health dimension. There were no predictors representing the Wellbeing Model dimensions of purpose and relationship, or emotional, mental, and spiritual health. This review showed that there is rigorously conducted research on predictors of liver transplant survival; however, the reported significant results were inconsistent across studies, and further research is needed to examine liver transplantation from a whole-person perspective.


medical informatics europe | 2015

Using EHR-Linked Biobank Data to Study Metformin Pharmacogenomics

Matthew K. Breitenstein; György J. Simon; Euijung Ryu; Sebastian M. Armasu; Richard M. Weinshilboum; Liewei Wang; Jyotishman Pathak

Metformin is a commonly prescribed diabetes medication whose mechanism of action is poorly understood. In this study we utilized EHR-linked biobank data to elucidate the impact of genomic variation on glycemic response to metformin. Our study found significant gene- and SNP-level associations within the beta-2 subunit of the heterotrimeric adenosine monophosphate-activated protein kinase complex. Using EHR phenotypes, we were able to add clarity to the ongoing metformin pharmacogenomic dialogue.


knowledge discovery and data mining | 2011

Understanding atrophy trajectories in Alzheimer's disease using association rules on MRI images

György J. Simon; Peter W. Li; Clifford R. Jack; Prashanthi Vemuri

Alzheimer's disease (AD) is associated with progressive cognitive decline leading to dementia. The atrophy/loss of brain structure as seen on Magnetic Resonance Imaging (MRI) is strongly correlated with the severity of the cognitive impairment in AD. In this paper, we set out to find associations between predefined regions of the brain (regions of interest; ROIs) and the severity of the disease. Specifically, we use these associations to address two important issues in AD: (i) typical versus atypical atrophy patterns and (ii) the origin and direction of progression of atrophy, which is currently under debate. We observed that each AD-related ROI is associated with a wide range of severity and that the difference between ROIs is merely a difference in severity distribution. To model differences between the severity distribution of a subpopulation (with significant atrophy in certain ROIs) and the severity distribution of the entire population, we developed the concept of Distributional Association Rules. Using the Distributional Association Rules, we clustered ROIs into disease subsystems. We define a disease subsystem as a contiguous set of ROIs that are collectively implicated in AD. AD is known to be heterogeneous in the sense that multiple sets of ROIs may be related to the disease in a population. We proposed an enhancement to association rule mining in which the algorithm only discovers association rules with ROIs that form an approximately contiguous volume. Next, we applied these association rules to infer the direction of disease progression based on the support measures of the association rules. We also developed a novel statistical test to determine the statistical significance of the discovered direction. We evaluated the proposed method on the Mayo Clinic Alzheimer's Disease Research Center (ADRC) prospective patient cohorts. 
The key achievements of the methodology are that it accurately identified larger disease subsystems implicated in typical and atypical AD and that it successfully mapped the directions of disease progression. The wealth of data available in radiology gives rise to opportunities for applying this methodology to map out the trajectory of several other diseases, e.g., other neurodegenerative diseases and cancers, most notably breast cancer. The applicability of this method is not limited to image data, as associating predictors with severity provides valuable information in most areas of medicine as well as in other industries.


Studies in health technology and informatics | 2016

Clustering the Whole-Person Health Data to Predict Liver Transplant Survival.

Lisiane Pruinelli; Karen A. Monsen; György J. Simon; Bonnie L. Westra

This study aims to discover groups (clusters) of patients who share whole-person characteristics. An unsupervised clustering analysis using a hierarchical agglomerative approach was applied to identify meaningful groups of patient characteristics. Results showed that it is possible to identify clusters that have similar patient characteristics, and that these characteristics may be associated with survival.
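As a minimal illustration of hierarchical agglomerative clustering, the sketch below repeatedly merges the two closest clusters until a requested number remains. The choice of average linkage with Euclidean distance, the function names, and the toy feature vectors are assumptions; the abstract does not state which linkage or distance the study used.

```python
from math import dist

def average_linkage(a, b, points):
    """Mean Euclidean distance between all cross-cluster pairs of points."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

def agglomerative(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Returns a list of clusters, each a list of indices into points.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None  # (distance, index_x, index_y) of the closest pair so far
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = average_linkage(clusters[x], clusters[y], points)
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters

# Hypothetical two-feature patient vectors forming two obvious groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
groups = agglomerative(points, n_clusters=2)
```

This naive version recomputes every pairwise linkage at each step, which is acceptable for small examples but too slow for large cohorts.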

Collaboration


Dive into György J. Simon's collaborations.

Top Co-Authors

Vipin Kumar

University of Minnesota

Sanjoy Dey

University of Minnesota
