Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Maryam Panahiazar is active.

Publication


Featured research published by Maryam Panahiazar.


Journal of Biomedical Informatics | 2016

Developing EHR-driven heart failure risk prediction models using CPXR(Log) with the probabilistic loss function

Vahid Taslimitehrani; Guozhu Dong; Naveen L. Pereira; Maryam Panahiazar; Jyotishman Pathak

Computerized survival prediction in healthcare, which identifies the risk of disease mortality, helps healthcare providers effectively manage their patients by providing appropriate treatment options. In this study, we propose to apply a classification algorithm, Contrast Pattern Aided Logistic Regression (CPXR(Log)) with a probabilistic loss function, to develop and validate prognostic risk models to predict 1-, 2-, and 5-year survival in heart failure (HF) using data from electronic health records (EHRs) at Mayo Clinic. CPXR(Log) constructs a pattern-aided logistic regression model defined by several patterns and corresponding local logistic regression models. One of the models generated by CPXR(Log) achieved an AUC and accuracy of 0.94 and 0.91, respectively, and significantly outperformed prognostic models reported in prior studies. Data extracted from EHRs allowed incorporation of patient comorbidities into our models, which helped improve the performance of the CPXR(Log) models (15.9% AUC improvement), although it did not improve the accuracy of the models built by other classifiers. We also propose a probabilistic loss function to distinguish large-error from small-error instances. The new loss function used in the algorithm outperforms functions used in previous studies, yielding a 1% improvement in AUC. This study revealed that building prediction models from EHR data can be very challenging for existing classification methods due to the high dimensionality and complexity of EHR data. The risk models developed by CPXR(Log) also reveal that HF is a highly heterogeneous disease, i.e., different subgroups of HF patients require different considerations in their diagnosis and treatment. Our risk models provided two valuable insights for the application of predictive modeling techniques in biomedicine: logistic risk models often make systematic prediction errors, and it is prudent to use subgroup-based prediction models, such as those given by CPXR(Log), when investigating heterogeneous diseases.
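CPXR(Log) itself is not available in common machine learning libraries, but its core pattern-aided idea can be sketched: patients matching a discovered contrast pattern are scored by a local logistic model, while all others fall through to a baseline model. The contrast pattern and feature names below are illustrative assumptions, not patterns from the paper.

```python
# Minimal sketch of the pattern-aided idea behind CPXR(Log): route patients
# that match a contrast pattern to a local logistic model, everyone else to
# a baseline model. Pattern and features are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

def matches_pattern(X):
    # Hypothetical contrast pattern: elderly patients with low ejection fraction.
    return (X[:, 0] >= 75) & (X[:, 1] < 0.35)   # X[:,0]=age, X[:,1]=LVEF

rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(40, 95, 500), rng.uniform(0.1, 0.7, 500)])
y = (rng.random(500) < 0.3).astype(int)         # toy 5-year mortality labels

mask = matches_pattern(X)
local = LogisticRegression().fit(X[mask], y[mask])       # subgroup model
baseline = LogisticRegression().fit(X[~mask], y[~mask])  # model for the rest

def predict_risk(X_new):
    m = matches_pattern(X_new)
    out = np.empty(len(X_new))
    out[m] = local.predict_proba(X_new[m])[:, 1]
    out[~m] = baseline.predict_proba(X_new[~m])[:, 1]
    return out
```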


World Congress on Services | 2011

Web Service Composition Using Service Suggestions

Rui Wang; Chaitanya Guttula; Maryam Panahiazar; Haseeb Yousaf; John A. Miller; Eileen Kraemer; Jessica C. Kissinger

This paper presents a semi-automatic Web service composition approach. The approach ranks all available candidate Web service operations based on semantic annotations and suggests service operations to a human designer during the process of Web service composition. The ranking scores are based on data mediation, functionality, and formal service specifications. A formal graph model, the IODAG, is defined to formalize the input/output schema of a Web service operation. Three data mediation algorithms are developed to handle the data heterogeneities that arise during Web service composition. The data mediation algorithms analyze the schemas of the inputs and outputs of service operations and consider the structures of those schemas. A typed representation for our data mediation approach, which formalizes the data mediation problem as a subtype-checking problem, is presented. An evaluation is performed to study the effectiveness of the different data mediation and service suggestion algorithms used to assist designers in composing Web services.
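The suggestion mechanism can be illustrated with a toy ranker that scores each candidate operation by how well its required inputs are covered by data already produced in the composition, blended with a functionality score. The scoring functions, weights, and candidate operations below are assumptions for illustration, not the paper's actual formulas.

```python
# Illustrative ranking of candidate Web service operations: combine a data
# mediation score (input coverage) with a functionality score. Weights and
# example operations are hypothetical.
def mediation_score(available_outputs, required_inputs):
    # Fraction of the candidate's required inputs satisfiable from available data.
    if not required_inputs:
        return 1.0
    return len(required_inputs & available_outputs) / len(required_inputs)

def rank_candidates(available_outputs, candidates, w_med=0.7, w_fun=0.3):
    scored = [
        (w_med * mediation_score(available_outputs, c["inputs"])
         + w_fun * c["functionality"], c["name"])
        for c in candidates
    ]
    return sorted(scored, reverse=True)

candidates = [
    {"name": "alignSequences", "inputs": {"FASTA"}, "functionality": 0.9},
    {"name": "buildTree", "inputs": {"Alignment", "Model"}, "functionality": 0.8},
]
print(rank_candidates({"FASTA", "Model"}, candidates))
```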


Medical Informatics Europe | 2015

Using EHRs for Heart Failure Therapy Recommendation Using Multidimensional Patient Similarity Analytics

Maryam Panahiazar; Vahid Taslimitehrani; Naveen L. Pereira; Jyotishman Pathak

Electronic Health Records (EHRs) contain a wealth of information about an individual patient’s diagnosis, treatment, and health outcomes. This information can be leveraged effectively to identify patients who are similar to each other for disease diagnosis and prognosis. In recent years, several machine learning methods have been proposed for assessing patient similarity, although these techniques have primarily focused on the use of patient diagnosis data from EHRs for the learning task. In this study, we develop a multidimensional patient similarity assessment technique that leverages multiple types of information from the EHR and predicts a medication plan for each new patient based on prior knowledge and data from similar patients. In our algorithm, patients are clustered into different groups using a hierarchical clustering approach and subsequently assigned a medication plan based on their similarity index to the overall patient population. We evaluated the performance of our approach on a cohort of heart failure patients (N=1386) identified from EHR data at Mayo Clinic and achieved an AUC of 0.74. Our results suggest that it is feasible to harness population-based information from EHRs for individual patient-specific assessment.
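A minimal sketch of this pipeline, under simplified assumptions (toy patient features, Euclidean distance, and majority-vote plan assignment standing in for the paper's similarity index):

```python
# Cluster patients hierarchically on multidimensional EHR features, then
# recommend to a new patient the most common medication plan in the nearest
# cluster. Features and plan labels are toy placeholders.
import numpy as np
from collections import Counter
from sklearn.cluster import AgglomerativeClustering

X = np.random.rand(100, 12)                 # toy patient feature vectors
plans = np.random.choice(["ACE inhibitor", "ARB", "beta-blocker"], size=100)

labels = AgglomerativeClustering(n_clusters=5).fit_predict(X)
centroids = np.array([X[labels == k].mean(axis=0) for k in range(5)])

def recommend(new_patient):
    # Assign the new patient to the nearest cluster centroid,
    # then return that cluster's most common medication plan.
    k = np.argmin(np.linalg.norm(centroids - new_patient, axis=1))
    return Counter(plans[labels == k]).most_common(1)[0][0]

print(recommend(np.random.rand(12)))
```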


BMC Medical Genomics | 2013

Advancing data reuse in phyloinformatics using an ontology-driven Semantic Web approach

Maryam Panahiazar; Amit P. Sheth; Ajith Harshana Ranabahu; Rutger A. Vos; Jim Leebens-Mack

Phylogenetic analyses can resolve historical relationships among genes, organisms, or higher taxa. Understanding such relationships can elucidate a wide range of biological phenomena, including, for example, the importance of gene and genome duplications in the evolution of gene function, the role of adaptation as a driver of diversification, or the evolutionary consequences of biogeographic shifts. Phyloinformaticists are developing data standards, databases, and communication protocols (e.g., Application Programming Interfaces, APIs) to extend the accessibility of gene trees, species trees, and the metadata necessary to interpret these trees, thus enabling researchers across the life sciences to reuse phylogenetic knowledge. Specifically, Semantic Web technologies are being developed to make phylogenetic knowledge interpretable by web agents, thereby enabling intelligently automated, high-throughput reuse of results generated by phylogenetic research. This manuscript describes an ontology-driven, semantic problem-solving environment for phylogenetic analyses and introduces artefacts that can advance phyloinformatic efforts to promote accessibility of trees and underlying metadata. PhylOnt is an extensible ontology with concepts describing tree types and tree-building methodologies, including estimation methods, models, and programs. In addition, we present the PhylAnt platform for annotating scientific articles and NeXML files with PhylOnt concepts. The novelty of this work is the annotation of NeXML files and phylogenetics-related documents with the PhylOnt ontology. This approach advances data reuse in phyloinformatics.
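The annotation step can be sketched as attaching ontology concepts to a NeXML file as RDF triples. The PhylOnt namespace and term names below are hypothetical placeholders, not the ontology's published IRIs, and the Dublin Core properties are one plausible choice of annotation vocabulary.

```python
# Illustrative PhylAnt-style annotation: tag a NeXML file with ontology
# concepts as RDF triples. Namespace and term names are placeholders.
from rdflib import Graph, Namespace, URIRef, Literal

PHYLONT = Namespace("http://example.org/phylont#")   # placeholder namespace
DC = Namespace("http://purl.org/dc/terms/")

g = Graph()
doc = URIRef("file:///data/primates.nexml")
g.add((doc, DC.subject, PHYLONT.MaximumLikelihood))  # tree-building method
g.add((doc, DC.subject, PHYLONT.GeneTree))           # tree type
g.add((doc, DC.description, Literal("ML gene tree, GTR+G model")))

print(g.serialize(format="turtle"))
```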


Scientific Data | 2017

Precision annotation of digital samples in NCBI’s gene expression omnibus

Dexter Hadley; James Pan; Osama El-Sayed; Jihad Aljabban; Imad Aljabban; Tej D. Azad; Mohamad Omar Hadied; Shuaib Raza; Benjamin Abhishek Rayikanti; Bin Chen; Hyojung Paik; Dvir Aran; Jordan Spatz; Daniel Himmelstein; Maryam Panahiazar; Sanchita Bhattacharya; Marina Sirota; Mark A. Musen; Atul J. Butte

The Gene Expression Omnibus (GEO) contains more than two million digital samples from functional genomics experiments amassed over almost two decades. However, individual sample metadata remains poorly described by unstructured free-text attributes, preventing its large-scale reanalysis. We introduce the Search Tag Analyze Resource for GEO as a web application (http://STARGEO.org) to curate better annotations of sample phenotypes uniformly across different studies, and to use these sample annotations to define robust genomic signatures of disease pathology by meta-analysis. In this paper, we target a small group of biomedical graduate students to show rapid crowd-curation of precise sample annotations across all phenotypes, and we demonstrate the biological validity of these crowd-curated annotations for breast cancer. STARGEO.org makes GEO data findable, accessible, interoperable, and reusable (i.e., FAIR) to ultimately facilitate knowledge discovery. Our work demonstrates the utility of crowd-curation and interpretation of open ‘big data’ under FAIR principles as a first step towards realizing an ideal paradigm of precision medicine.


Database | 2016

Predicting structured metadata from unstructured metadata

Lisa Posch; Maryam Panahiazar; Michel Dumontier; Olivier Gevaert

[This corrects the article DOI: 10.1093/database/baw080.]


Journal of Digital Imaging | 2018

Large Scale Semi-Automated Labeling of Routine Free-Text Clinical Records for Deep Learning

Hari Trivedi; Maryam Panahiazar; April S. Liang; Dmytro S. Lituiev; Peter Chang; Jae Ho Sohn; Yunn-Yi Chen; Benjamin L. Franc; Bonnie N. Joe; Dexter Hadley

Breast cancer is a leading cause of cancer death among women in the USA. Screening mammography is effective in reducing mortality, but has a high rate of unnecessary recalls and biopsies. While deep learning can be applied to mammography, the required large-scale labeled datasets are difficult to obtain. We aim to remove many barriers to dataset development by automatically harvesting data from existing clinical records using a hybrid framework combining traditional NLP and IBM Watson. An expert reviewer manually annotated 3521 breast pathology reports with one of four outcomes: left positive, right positive, bilateral positive, negative. Traditional NLP techniques using seven different machine learning classifiers were compared to IBM Watson’s automated Natural Language Classifier (NLC). Techniques were evaluated using precision, recall, and F-measure. Logistic regression outperformed all other traditional machine learning classifiers and was used for subsequent comparisons. Both traditional NLP and Watson’s NLC performed well for cases under 1024 characters, with weighted average F-measures above 0.96 across all classes. Performance of traditional NLP was lower for cases over 1024 characters, with an F-measure of 0.83. We demonstrate a hybrid framework using traditional NLP techniques combined with IBM Watson to annotate over 10,000 breast pathology reports for development of a large-scale database to be used for deep learning in mammography. Our work shows that traditional NLP and IBM Watson perform extremely well for cases under 1024 characters and can accelerate the rate of data annotation.
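The traditional-NLP arm can be sketched as a TF-IDF plus logistic regression pipeline over the four outcome classes. The toy reports below are invented; the real study used 3521 expert-annotated pathology reports.

```python
# Minimal sketch of text classification of pathology reports into the four
# outcomes, using TF-IDF features and logistic regression. Reports are toy
# examples for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

reports = [
    "invasive carcinoma identified in the left breast specimen",
    "right breast core biopsy positive for ductal carcinoma",
    "bilateral specimens show invasive lobular carcinoma",
    "benign breast tissue, no evidence of malignancy",
]
labels = ["left positive", "right positive", "bilateral positive", "negative"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(reports, labels)
print(clf.predict(["left breast biopsy shows invasive ductal carcinoma"]))
```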


Informatics | 2018

Large Scale Advanced Data Analytics on Skin Conditions from Genotype to Phenotype

Maryam Panahiazar; Darya Fadavi; Jihad Aljabban; Laraib Safeer; Imad Aljabban; Dexter Hadley

A crucial factor in Big Data is to take advantage of available data and use it for new discovery or hypothesis generation. In this study, we analyzed large-scale data on skin conditions, ranging from the literature to OMICS data such as the genome, proteome, and metabolome. Skin acts as a natural barrier to the world around us, protects our body from different conditions, viruses, and bacteria, and plays a big part in appearance. We have included hyperpigmentation, postinflammatory hyperpigmentation, melasma, rosacea, actinic keratosis, and pigmentation in this study. These conditions were selected based on analysis of large-scale UCSF patient data covering 527,273 females from 2011 to 2017, and of related publications on skin conditions from 2000 to 2017. The selected conditions were confirmed with experts in the field from different research centers and hospitals. We propose a novel framework over large-scale publicly available data to find the common genotypes and phenotypes of different skin conditions. The outcome of this study, based on advanced data analytics, provides information on skin conditions and their treatments to the research community and introduces new hypotheses for possible genotype and phenotype targets. The novelty of this work is a meta-analysis of different features across different skin conditions. Instead of looking at individual conditions with one or two features, as most previous works have done, we looked at several conditions with different features to find the common factors between them. Our hypothesis is that by finding the overlap in genotype and phenotype between different skin conditions, we can suggest using a drug recommended for one condition to treat another condition that shares similar genes or other common phenotypes. We identified common genes between these skin conditions and were able to find common targets between conditions, such as common drugs. Our work has implications for discovery and new hypotheses to improve health quality, and is geared towards making Big Data useful.
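The cross-condition overlap step can be illustrated by intersecting per-condition gene sets. The gene lists below are invented placeholders, not the study's findings.

```python
# Illustrative overlap analysis: intersect per-condition gene sets to surface
# shared genotype targets and candidate condition pairs for drug repurposing.
from itertools import combinations

condition_genes = {
    "melasma": {"TYR", "MITF", "KIT"},
    "postinflammatory_hyperpigmentation": {"TYR", "MC1R", "KIT"},
    "rosacea": {"TLR2", "KIT", "TYR"},
}

shared = set.intersection(*condition_genes.values())
print("genes common to all conditions:", shared)

# Pairwise overlaps suggest condition pairs where a drug effective for one
# condition might be hypothesized to help the other.
for a, b in combinations(condition_genes, 2):
    print(a, "&", b, "->", condition_genes[a] & condition_genes[b])
```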


Journal of Biomedical Informatics | 2017

Predicting biomedical metadata in CEDAR: A study of Gene Expression Omnibus (GEO)

Maryam Panahiazar; Michel Dumontier; Olivier Gevaert

A crucial and limiting factor in data reuse is the lack of accurate, structured, and complete descriptions of data, known as metadata. Towards improving the quantity and quality of metadata, we propose a novel metadata prediction framework to learn associations from existing metadata that can be used to predict metadata values. We evaluate our framework in the context of experimental metadata from the Gene Expression Omnibus (GEO). We applied four rule mining algorithms to the most common structured metadata elements (sample type, molecular type, platform, label type, and organism) from over 1.3 million GEO records. We examined the quality of well-supported rules from each algorithm and visualized the dependencies among metadata elements. Finally, we evaluated the performance of the algorithms in terms of accuracy, precision, recall, and F-measure. We found that PART is the best algorithm, outperforming Apriori, Predictive Apriori, and Decision Table. All algorithms perform significantly better in predicting class values than the majority-vote classifier. We found that the performance of the algorithms is related to the dimensionality of the GEO elements: the average performance of all algorithms increases as the dimensionality of the unique values of these elements decreases (2697 platforms, 537 organisms, 454 labels, 9 molecules, and 5 types). Our work suggests that experimental metadata such as that in GEO can be accurately predicted using rule mining algorithms. Our work has implications for both prospective and retrospective augmentation of metadata quality, which are geared towards making data easier to find and reuse.
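The association-rule idea can be sketched with the mlxtend library (the study itself evaluated Apriori, Predictive Apriori, PART, and Decision Table): each GEO record becomes a transaction of element=value items, and high-confidence rules can then suggest missing values. The records below are toy examples.

```python
# Mine association rules over metadata "transactions"; a rule such as
# {organism=Homo sapiens} -> {platform=GPL570} could fill a missing field.
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

records = [
    ["organism=Homo sapiens", "molecule=total RNA", "platform=GPL570"],
    ["organism=Homo sapiens", "molecule=total RNA", "platform=GPL570"],
    ["organism=Mus musculus", "molecule=total RNA", "platform=GPL1261"],
]
te = TransactionEncoder()
df = pd.DataFrame(te.fit(records).transform(records), columns=te.columns_)

frequent = apriori(df, min_support=0.5, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.9)
print(rules[["antecedents", "consequents", "confidence"]])
```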


F1000Research | 2017

CEDAR's predictive data entry: easier and faster creation of high-quality metadata

Marcos Martínez-Romero; Martin J. O’Connor; Ravi D. Shankar; Maryam Panahiazar; Debra Willrett; Attila L. Egyedi; Olivier Gevaert; John Graybeal; Mark A. Musen

The Value Recommender system is the first of a planned set of intelligent authoring components in the CEDAR system. Future efforts will concentrate on deeper analyses of metadata to discover more interesting relationships between metadata fields, which will then drive new tools to assist biomedical investigators when annotating their data. The Value Recommender supports both free-text values and controlled terms; for example, the system can suggest terms from the Human Disease Ontology (DOID).
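A frequency-based sketch of value recommendation, conditioned on fields the investigator has already filled in, is shown below. This is an illustrative simplification, not CEDAR's published algorithm, and the records are invented.

```python
# Suggest likely values for a target metadata field, given values already
# entered in other fields, by conditional frequency over past records.
from collections import Counter

past_records = [
    {"disease": "melanoma", "tissue": "skin"},
    {"disease": "melanoma", "tissue": "skin"},
    {"disease": "glioblastoma", "tissue": "brain"},
]

def suggest(target_field, entered, records, top_n=3):
    # Keep only records consistent with everything entered so far.
    matching = [r for r in records
                if all(r.get(f) == v for f, v in entered.items())]
    counts = Counter(r[target_field] for r in matching if target_field in r)
    return [value for value, _ in counts.most_common(top_n)]

print(suggest("tissue", {"disease": "melanoma"}, past_records))  # ['skin']
```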

Collaboration


Dive into Maryam Panahiazar's collaboration.

Top Co-Authors

Dexter Hadley

University of California

Laraib Safeer

Baylor College of Medicine

Kamal Khorfan

Henry Ford Health System