Jyotishman Pathak | Researchain

Publication

Featured research published by Jyotishman Pathak.

Journal of Biomedical Informatics | 2016

Developing EHR-driven heart failure risk prediction models using CPXR(Log) with the probabilistic loss function

Vahid Taslimitehrani; Guozhu Dong; Naveen L. Pereira; Maryam Panahiazar; Jyotishman Pathak

Computerized survival prediction in healthcare, which identifies the risk of disease mortality, helps healthcare providers manage their patients effectively by providing appropriate treatment options. In this study, we propose to apply a classification algorithm, Contrast Pattern Aided Logistic Regression (CPXR(Log)) with the probabilistic loss function, to develop and validate prognostic risk models to predict 1-, 2-, and 5-year survival in heart failure (HF) using data from electronic health records (EHRs) at Mayo Clinic. CPXR(Log) constructs a pattern aided logistic regression model defined by several patterns and corresponding local logistic regression models. One of the models generated by CPXR(Log) achieved an AUC and accuracy of 0.94 and 0.91, respectively, and significantly outperformed prognostic models reported in prior studies. Data extracted from EHRs allowed incorporation of patient co-morbidities into our models, which helped improve the performance of the CPXR(Log) models (15.9% AUC improvement), although it did not improve the accuracy of the models built by other classifiers. We also propose a probabilistic loss function to determine the large error and small error instances. The new loss function used in the algorithm outperforms other functions used in previous studies by a 1% improvement in the AUC. This study revealed that using EHR data to build prediction models can be very challenging using existing classification methods due to the high dimensionality and complexity of EHR data. The risk models developed by CPXR(Log) also reveal that HF is a highly heterogeneous disease, i.e., different subgroups of HF patients require different types of considerations with their diagnosis and treatment.
Our risk models provided two valuable insights for application of predictive modeling techniques in biomedicine: Logistic risk models often make systematic prediction errors, and it is prudent to use subgroup based prediction models such as those given by CPXR(Log) when investigating heterogeneous diseases.

BMC Psychiatry | 2016

Quantifying the impact of chronic conditions on a diagnosis of major depressive disorder in adults: a cohort study using linked electronic medical records.

Euijung Ryu; Alanna M. Chamberlain; Richard S. Pendegraft; Tanya M. Petterson; William V. Bobo; Jyotishman Pathak

Background: Major depressive disorder (MDD) is often comorbid with other chronic mental and physical health conditions. Although the literature widely acknowledges the association of many chronic conditions with the risk of MDD, the relative importance of these conditions on MDD risk in the presence of other conditions is not well investigated. In this study, we aimed to quantify the relative contribution of selected chronic conditions to identify the conditions most influential to MDD risk in adults and identify differences by age. Methods: This study used electronic health record (EHR) data on patients empanelled with primary care at Mayo Clinic in June 2013. A validated EHR-based algorithm was applied to identify newly diagnosed MDD patients between 2000 and 2013. Non-MDD controls were matched 1:1 to MDD cases on birth year (±2 years), sex, and outpatient clinic visits in the same year of MDD case diagnosis. Twenty-four chronic conditions defined by the Chronic Conditions Data Warehouse were ascertained in both cases and controls using diagnosis codes within 5 years of index dates (diagnosis dates for cases, and the first clinic visit dates for matched controls). For each age group (45 years or younger, between 46 and 60, and over 60 years), conditional logistic regression models were used to test the association between each condition and subsequent MDD risk, adjusting for educational attainment and obesity. The relative influence of these conditions on the risk of MDD was quantified using gradient boosting machine models. Results: A total of 11,375 incident MDD cases were identified between 2000 and 2013. Most chronic conditions (except for eye conditions) were associated with risk of MDD, with different association patterns observed depending on age.
Among 24 chronic conditions, the greatest relative contribution was observed for diabetes mellitus for subjects aged ≤ 60 years and rheumatoid arthritis/osteoarthritis for those over 60 years. Conclusions: Our results suggest that specific chronic conditions such as diabetes mellitus and rheumatoid arthritis/osteoarthritis may have greater influence than others on the risk of MDD.

Journal of Cardiovascular Translational Research | 2016

Improvement in Cardiovascular Risk Prediction with Electronic Health Records

Mindy M. Pike; Paul A. Decker; Nicholas B. Larson; Jennifer L. St. Sauver; Paul Y. Takahashi; Véronique L. Roger; Walter A. Rocca; Virginia M. Miller; Janet E. Olson; Jyotishman Pathak; Suzette J. Bielinski

The aim of this study was to compare the QRISKII, an electronic health data-based risk score, to the Framingham Risk Score (FRS) and atherosclerotic cardiovascular disease (ASCVD) score. Risk estimates were calculated for a cohort of 8783 patients, and the patients were followed up from November 29, 2012, through June 1, 2015, for a cardiovascular disease (CVD) event. During follow-up, 246 men and 247 women had a CVD event. Cohen’s kappa statistic for the comparison of the QRISKII and FRS was 0.22 for men and 0.23 for women, with the QRISKII classifying more patients in the higher-risk groups. The QRISKII and ASCVD were more similar with kappa statistics of 0.49 for men and 0.51 for women. The QRISKII shows increased discrimination with area under the curve (AUC) statistics of 0.65 and 0.71, respectively, compared to the FRS (0.59 and 0.66) and ASCVD (0.63 and 0.69). These results demonstrate that incorporating additional data from the electronic health record (EHR) may improve CVD risk stratification.
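Cohen's kappa, the agreement statistic reported in this abstract, can be illustrated with a minimal sketch. The function name `cohens_kappa` and the risk-group labels below are hypothetical and chosen only for illustration; they are not taken from the study, and the study's own risk categories and data are not reproduced here.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two sets of categorical labels on the same subjects."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: share of subjects placed in the same category by both scores.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the marginal proportions, summed over categories.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical risk groups assigned to four patients by two different scores.
score_one = ["low", "low", "high", "high"]
score_two = ["low", "high", "high", "high"]
kappa = cohens_kappa(score_one, score_two)
```

A kappa near 0 indicates agreement no better than chance, and a kappa of 1 indicates perfect agreement, which is why the values of 0.22 to 0.23 between QRISKII and FRS point to substantially different risk classifications.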

Advances in Social Networks Analysis and Mining | 2017

Semi-Supervised Approach to Monitoring Clinical Depressive Symptoms in Social Media

Amir Hossein Yazdavar; Hussein S. Al-Olimat; Monireh Ebrahimi; Goonmeet Bajaj; Tanvi Banerjee; Krishnaprasad Thirunarayan; Jyotishman Pathak; Amit P. Sheth

With the rise of social media, millions of people are routinely expressing their moods, feelings, and daily struggles with mental health issues on social media platforms like Twitter. Unlike traditional observational cohort studies conducted through questionnaires and self-reported surveys, we explore the reliable detection of clinical depression from tweets obtained unobtrusively. Based on the analysis of tweets crawled from users with self-reported depressive symptoms in their Twitter profiles, we demonstrate the potential for detecting clinical depression symptoms which emulate the PHQ-9 questionnaire clinicians use today. Our study uses a semi-supervised statistical model to evaluate how the duration of these symptoms and their expression on Twitter (in terms of word usage patterns and topical preferences) align with the medical findings reported via the PHQ-9. Our proactive and automatic screening tool is able to identify clinical depressive symptoms with an accuracy of 68% and precision of 72%.

Thrombosis and Haemostasis | 2017

Identification of unique venous thromboembolism-susceptibility variants in African-Americans

John A. Heit; Sebastian M. Armasu; Bryan M. McCauley; Iftikhar J. Kullo; Hugues Sicotte; Jyotishman Pathak; Christopher G. Chute; Omri Gottesman; Erwin P. Bottinger; Joshua C. Denny; Dan M. Roden; Rongling Li; Marylyn D. Ritchie; M. De Andrade

To identify novel single nucleotide polymorphisms (SNPs) associated with venous thromboembolism (VTE) in African-Americans (AAs), we performed a genome-wide association study (GWAS) of VTE in AAs using the Electronic Medical Records and Genomics (eMERGE) Network, comprised of seven sites each with DNA biobanks (total ~39,200 unique DNA samples) with genome-wide SNP data (imputed to 1000 Genomes Project cosmopolitan reference panel) and linked to electronic health records (EHRs). Using a validated EHR-driven phenotype extraction algorithm, we identified VTE cases and controls and tested for an association between each SNP and VTE using unconditional logistic regression, adjusted for age, sex, stroke, site-platform combination and sickle cell risk genotype. Among 393 AA VTE cases and 4,941 AA controls, three intragenic SNPs reached genome-wide significance: LEMD3 rs138916004 (OR=3.2; p=1.3E-08), LY86 rs3804476 (OR=1.8; p=2E-08) and LOC100130298 rs142143628 (OR=4.5; p=4.4E-08); all three SNPs validated using internal cross-validation, parametric bootstrap and meta-analysis methods. LEMD3 rs138916004 and LOC100130298 rs142143628 are only present in Africans (1000G data). LEMD3 showed a significant differential expression in both NCBI Gene Expression Omnibus (GEO) and the Mayo Clinic gene expression data, LOC100130298 showed a significant differential expression only in the GEO expression data, and LY86 showed a significant differential expression only in the Mayo expression data. LEMD3 encodes for an antagonist of TGF-β-induced cell proliferation arrest. LY86 encodes for MD-1 which down-regulates the pro-inflammatory response to lipopolysaccharide; LY86 variation was previously associated with VTE in white women; LOC100130298 is a non-coding RNA gene with unknown regulatory activity in gene expression and epigenetics.

Journal of Biomedical Informatics | 2016

Developing a data element repository to support EHR-driven phenotype algorithm authoring and execution

Guoqian Jiang; Richard C. Kiefer; Luke V. Rasmussen; Harold R. Solbrig; Huan Mo; Jennifer A. Pacheco; Jie Xu; Enid Montague; William K. Thompson; Joshua C. Denny; Christopher G. Chute; Jyotishman Pathak

The Quality Data Model (QDM) is an information model developed by the National Quality Forum for representing electronic health record (EHR)-based electronic clinical quality measures (eCQMs). In conjunction with the HL7 Health Quality Measures Format (HQMF), QDM contains core elements that make it a promising model for representing EHR-driven phenotype algorithms for clinical research. However, the current QDM specification is available only as descriptive documents suitable for human readability and interpretation, but not for machine consumption. The objective of the present study is to develop and evaluate a data element repository (DER) for providing machine-readable QDM data element service APIs to support phenotype algorithm authoring and execution. We used the ISO/IEC 11179 metadata standard to capture the structure for each data element, and leveraged Semantic Web technologies to facilitate semantic representation of these metadata. We observed that there are a number of underspecified areas in the QDM, including the lack of model constraints and pre-defined value sets. We propose a harmonization with the models developed in HL7 Fast Healthcare Interoperability Resources (FHIR) and Clinical Information Modeling Initiatives (CIMI) to enhance the QDM specification and enable the extensibility and better coverage of the DER. We also compared the DER with the existing QDM implementation utilized within the Measure Authoring Tool (MAT) to demonstrate the scalability and extensibility of our DER-based approach.

Clinical and Translational Science | 2018

Electronic Health Record Phenotypes for Precision Medicine: Perspectives and Caveats from Treatment of Breast Cancer at a Single Institution

Matthew K. Breitenstein; Hongfang Liu; Kara N. Maxwell; Jyotishman Pathak; Rui Zhang

Precision medicine is at the forefront of biomedical research. Cancer registries provide rich perspectives and electronic health records (EHRs) are commonly utilized to gather additional clinical data elements needed for translational research. However, manual annotation is resource‐intense and not readily scalable. Informatics‐based phenotyping presents an ideal solution, but perspectives obtained can be impacted by both data source and algorithm selection. We derived breast cancer (BC) receptor status phenotypes from structured and unstructured EHR data using rule‐based algorithms, including natural language processing (NLP). Overall, the use of NLP increased BC receptor status coverage by 39.2% from 69.1% with structured medication information alone. Using all available EHR data, estrogen receptor‐positive BC cases were ascertained with high precision (P = 0.976) and recall (R = 0.987) compared with gold standard chart‐reviewed patients. However, status negation (R = 0.591) decreased 40.2% when relying on structured medications alone. Using multiple EHR data types (and a thorough understanding of the perspectives offered) is necessary to derive robust EHR‐based precision medicine phenotypes.
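The precision and recall figures quoted in this abstract compare algorithm output against chart-reviewed labels. A minimal sketch of that comparison follows; the function name `precision_recall` and the boolean labels are hypothetical illustrations, not the study's data or code.

```python
def precision_recall(predicted, gold):
    """Precision and recall of boolean predictions against gold-standard labels."""
    if len(predicted) != len(gold):
        raise ValueError("inputs must have equal length")
    true_pos = sum(p and g for p, g in zip(predicted, gold))
    false_pos = sum(p and not g for p, g in zip(predicted, gold))
    false_neg = sum(g and not p for p, g in zip(predicted, gold))
    # Precision: of the cases the algorithm flagged, how many were truly positive.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    # Recall: of the truly positive cases, how many the algorithm found.
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Hypothetical phenotype calls (True = receptor-positive) versus chart review.
algorithm_calls = [True, True, False, True]
chart_review = [True, False, True, True]
precision, recall = precision_recall(algorithm_calls, chart_review)
```

Low recall with reasonable precision, as reported for status negation from structured medications alone, means the source rarely mislabels a case but misses many of them.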

Applied Clinical Informatics | 2016

Practical considerations for implementing genomic information resources

Luke V. Rasmussen; Casey Lynnette Overby; John J. Connolly; Christopher G. Chute; Joshua C. Denny; Robert R. Freimuth; Andrea L. Hartzler; Ingrid A. Holm; Shannon Manzi; Jyotishman Pathak; Peggy L. Peissig; Maureen E. Smith; Marc S. Williams; Brian Shirts; Elena M. Stoffel; Peter Tarczy-Hornoch; C. R. Rohrer Vitek; Wendy A. Wolf; Justin Starren

Objectives: To understand opinions and perceptions on the state of information resources specifically targeted to genomics, and approaches to delivery in clinical practice. Methods: We conducted a survey of genomic content use and its clinical delivery from representatives across eight institutions in the electronic Medical Records and Genomics (eMERGE) network and two institutions in the Clinical Sequencing Exploratory Research (CSER) consortium in 2014. Results: Eleven responses representing distinct projects across ten sites showed heterogeneity in how content is being delivered, with provider-facing content primarily delivered via the electronic health record (EHR) (n=10), and paper/pamphlets as the leading mode for patient-facing content (n=9). There was general agreement (91%) that new content is needed for patients and providers specific to genomics, and that while aspects of this content could be shared across institutions there remain site-specific needs (73% in agreement). Conclusion: This work identifies a need for the improved access to and expansion of information resources to support genomic medicine, and opportunities for content developers and EHR vendors to partner with institutions to develop needed resources, and streamline their use - such as a central content site in multiple modalities while implementing approaches to allow for site-specific customization.

Conference on Information and Knowledge Management | 2018

Let Me Tell You About Your Mental Health!: Contextualized Classification of Reddit Posts to DSM-5 for Web-based Intervention

Manas Gaur; Ugur Kursuncu; Amanuel Alambo; Amit P. Sheth; Raminta Daniulaityte; Krishnaprasad Thirunarayan; Jyotishman Pathak

Social media platforms are increasingly being used to share and seek advice on mental health issues. In particular, Reddit users freely discuss such issues on various subreddits, whose structure and content can be leveraged to formally interpret and relate subreddits and their posts in terms of mental health diagnostic categories. There is prior research on the extraction of mental health-related information, including symptoms, diagnosis, and treatments from social media; however, our approach can additionally provide actionable information to clinicians about the mental health of a patient in diagnostic terms for web-based intervention. Specifically, we provide a detailed analysis of the nature of subreddit content from a domain expert's perspective and introduce a novel approach to map each subreddit to the best matching DSM-5 (Diagnostic and Statistical Manual of Mental Disorders - 5th Edition) category using a multi-class classifier. Our classification algorithm analyzes all the posts of a subreddit by adapting topic modeling and word-embedding techniques, and utilizing curated medical knowledge bases to quantify relationship to DSM-5 categories. Our semantic encoding-decoding optimization approach reduces the false-alarm-rate from 30% to 2.5% over a comparable heuristic baseline, and our mapping results have been verified by domain experts achieving a kappa score of 0.84.

PLOS Currents | 2018

Risk Factors for Depression Among Civilians After the 9/11 World Trade Center Terrorist Attacks: A Systematic Review and Meta-Analysis

Abhinaba Chatterjee; Samprit Banerjee; Cheryl Stein; Min-hyung Kim; Joseph DeFerio; Jyotishman Pathak

Introduction: The development of depressive symptoms among the population of civilians who were not directly involved in recovery or rescue efforts following the 9/11 World Trade Center (WTC) terrorist attacks is not comprehensively understood. We performed a meta-analysis that examined the associations between multiple risk factors and depressive symptoms after the 9/11 WTC terrorist attacks in New York City among civilians including survivors, residents, and passersby. Methods: PubMed, Google Scholar, and the Cochrane Library were searched from September, 2001 through July, 2016. Reviewers identified eligible studies and synthesized odds ratios (ORs) using a random-effects model. Results: The meta-analysis included findings from 7 studies (29,930 total subjects). After adjusting for multiple comparisons, depressive symptoms were significantly associated with minority race/ethnicity (OR, 1.40; 99.5% Confidence Interval [CI], 1.04 to 1.88), lower income level (OR, 1.25; 99.5% CI, 1.09 to 1.43), post-9/11 social isolation (OR, 1.68; 99.5% CI, 1.13 to 2.49), post-9/11 change in employment (OR, 2.06; 99.5% CI, 1.30 to 3.26), not being married post-9/11 (OR, 1.59; 99.5% CI, 1.18 to 2.15), and knowing someone injured or killed (OR, 2.02; 99.5% CI, 1.42 to 2.89). Depressive symptoms were not significantly associated with greater age (OR, 0.86; 99.5% CI, 0.70 to 1.05), no college degree (OR, 1.32; 99.5% CI, 0.96 to 1.83), female sex (OR, 1.24; 99.5% CI, 0.98 to 1.59), or direct exposure to WTC related traumatic events (OR, 1.26; 99.5% CI, 0.69 to 2.30). Discussion: Findings from this study suggest that lack of post-disaster social capital was most strongly associated with depressive symptoms among the civilian population after the 9/11 WTC terrorist attacks, followed by bereavement and lower socioeconomic status. These risk factors should be identified among civilians in future disaster response efforts.
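The abstract states that odds ratios were synthesized with a random-effects model but does not name the estimator. The sketch below assumes the DerSimonian-Laird method, a common choice, purely for illustration. The function names and the per-study inputs are hypothetical and are not taken from the review.

```python
import math
from statistics import NormalDist


def pool_random_effects(log_ors, std_errs):
    """DerSimonian-Laird pooling of log odds ratios (assumed estimator)."""
    k = len(log_ors)
    # Fixed-effect (inverse-variance) weights and estimate.
    w = [1.0 / se ** 2 for se in std_errs]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum_w
    # Cochran's Q measures between-study heterogeneity.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau2 to each study's variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errs]
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_ors)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return pooled, se_pooled, tau2


def odds_ratio_ci(pooled, se_pooled, level=0.995):
    """Back-transform a pooled log OR to an OR with a two-sided confidence interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return (math.exp(pooled),
            math.exp(pooled - z * se_pooled),
            math.exp(pooled + z * se_pooled))


# Hypothetical per-study log odds ratios and standard errors.
pooled, se_pooled, tau2 = pool_random_effects([0.0, 1.0], [0.1, 0.1])
or_point, or_low, or_high = odds_ratio_ci(pooled, se_pooled)
```

The default `level=0.995` mirrors the 99.5% confidence intervals reported in the abstract, which are wider than conventional 95% intervals because of the adjustment for multiple comparisons.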
