Saeed Hassanpour | Researchain


Publication

Featured research published by Saeed Hassanpour.

Artificial Intelligence in Medicine | 2016

Information extraction from multi-institutional radiology reports

Saeed Hassanpour; Curtis P. Langlotz

OBJECTIVES The radiology report is the most important source of clinical imaging information. It documents critical information about the patient's health and the radiologist's interpretation of medical findings. It also communicates information to the referring physicians and records that information for future clinical and research use. Although efforts to structure some radiology report information through predefined templates are beginning to bear fruit, a large portion of radiology report information is entered in free text. The free text format is a major obstacle for rapid extraction and subsequent use of information by clinicians, researchers, and healthcare information systems. This difficulty is due to the ambiguity and subtlety of natural language, complexity of described images, and variations among different radiologists and healthcare organizations. As a result, radiology reports are used only once by the clinician who ordered the study and rarely are used again for research and data mining. In this work, machine learning techniques and a large multi-institutional radiology report repository are used to extract the semantics of the radiology report and overcome the barriers to the re-use of radiology report information in clinical research and other healthcare applications. MATERIAL AND METHODS We describe a machine learning system to annotate radiology reports and extract report contents according to an information model. This information model covers the majority of clinically significant contents in radiology reports and is applicable to a wide variety of radiology study types. Our automated approach uses discriminative sequence classifiers for named-entity recognition to extract and organize clinically significant terms and phrases consistent with the information model. We evaluated our information extraction system on 150 radiology reports from three major healthcare organizations and compared its results to a commonly used non-machine learning information extraction method. We also evaluated the generalizability of our approach across different organizations by training and testing our system on data from different organizations. RESULTS Our results show the efficacy of our machine learning approach in extracting the information model's elements (10-fold cross-validation average performance: precision: 87%, recall: 84%, F1 score: 85%) and its superiority and generalizability compared to the common non-machine learning approach (p-value < 0.05). CONCLUSIONS Our machine learning information extraction approach provides an effective automatic method to annotate and extract clinically significant information from a large collection of free text radiology reports. This information extraction system can help clinicians better understand the radiology reports and prioritize their review process. In addition, the extracted information can be used by researchers to link radiology reports to information from other data sources such as electronic health records and the patient's genome. Extracted information also can facilitate disease surveillance, real-time clinical decision support for the radiologist, and content-based image retrieval.
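
The reported F1 score is consistent with the reported precision and recall when F1 is taken as their harmonic mean. A minimal sketch of that arithmetic follows; the function name `f1_score` is illustrative and not from the paper, and an average of per-fold F1 scores need not exactly equal the F1 of the averaged precision and recall.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract above
# (10-fold cross-validation averages).
f1 = f1_score(0.87, 0.84)
print(f"F1 = {f1:.0%}")
```

With the abstract's figures this evaluates to roughly 0.855, which rounds to the reported 85%.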

Rules and Rule Markup Languages for the Semantic Web | 2009

Exploration of SWRL Rule Bases through Visualization, Paraphrasing, and Categorization of Rules

Saeed Hassanpour; Martin J. O'Connor; Amar K. Das

Rule bases are increasingly being used as repositories of knowledge content on the Semantic Web. As the size and complexity of these rule bases increase, developers and end users need methods of rule abstraction to facilitate rule management. In this paper, we describe a rule abstraction method for Semantic Web Rule Language (SWRL) rules that is based on lexical analysis and a set of heuristics. Our method results in a tree data structure that we exploit in creating techniques to visualize, paraphrase, and categorize SWRL rules. We evaluate our approach by applying it to several biomedical ontologies that contain SWRL rules, and show how the results reveal rule patterns within the rule base. We have implemented our method as a plug-in tool for Protege-OWL, the most widely used ontology modeling software for the Semantic Web. Our tool can allow users to rapidly explore content and patterns in SWRL rule bases, enabling their acquisition and management.

Journal of Pathology Informatics | 2017

Deep learning for classification of colorectal polyps on whole-slide images

Bruno Korbar; Andrea M. Olofson; Allen P. Miraflor; Catherine M. Nicka; Matthew A. Suriawinata; Lorenzo Torresani; Arief A. Suriawinata; Saeed Hassanpour

Context: Histopathological characterization of colorectal polyps is critical for determining the risk of colorectal cancer and future rates of surveillance for patients. However, this characterization is a challenging task and suffers from significant inter- and intra-observer variability. Aims: We built an automatic image analysis method that can accurately classify different types of colorectal polyps on whole-slide images to help pathologists with this characterization and diagnosis. Setting and Design: Our method is based on deep-learning techniques, which rely on numerous levels of abstraction for data representation and have shown state-of-the-art results for various image analysis tasks. Subjects and Methods: Our method covers five common types of polyps (i.e., hyperplastic, sessile serrated, traditional serrated, tubular, and tubulovillous/villous) that are included in the US Multisociety Task Force guidelines for colorectal cancer risk assessment and surveillance. We developed multiple deep-learning approaches by leveraging a dataset of 2074 crop images, which were annotated by multiple domain expert pathologists as reference standards. Statistical Analysis: We evaluated our method on an independent test set of 239 whole-slide images and measured standard machine-learning evaluation metrics of accuracy, precision, recall, and F1 score and their 95% confidence intervals. Results: Our evaluation shows that our method with residual network architecture achieves the best performance for classification of colorectal polyps on whole-slide images (overall accuracy: 93.0%, 95% confidence interval: 89.0%–95.9%). Conclusions: Our method can reduce the cognitive burden on pathologists and improve their efficacy in histopathological characterization of colorectal polyps and in subsequent risk assessment and follow-up recommendations.
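
The abstract reports accuracy with a 95% confidence interval but does not say how the interval was computed. As one common option, a Wilson score interval for a proportion can be sketched as below. The choice of the Wilson method, the function name `wilson_interval`, and the example counts are all assumptions for illustration, not details from the paper.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half_width, center + half_width


# Hypothetical counts, NOT taken from the paper: 90 correct out of 100 slides.
low, high = wilson_interval(successes=90, total=100)
print(f"accuracy = 90.0%, 95% CI: {low:.1%} to {high:.1%}")
```

Unlike the simple normal approximation, the Wilson interval stays inside [0, 1] and is asymmetric around the observed accuracy when that accuracy is close to 100%.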

Journal of Web Semantics | 2014

Clustering rule bases using ontology-based similarity measures

Saeed Hassanpour; Martin J. O’Connor; Amar K. Das

Rules are increasingly becoming an important form of knowledge representation on the Semantic Web. There are currently few methods that can ensure that the acquisition and management of rules can scale to the size of the Web. We previously developed methods to help manage large rule bases using syntactical analyses of rules. This approach did not incorporate semantics. As a result, rule categorization based on syntactic features may not be effective. In this paper, we present a novel approach for grouping rules based on whether the rule elements share relationships within a domain ontology. We have developed our method for rules specified in the Semantic Web Rule Language (SWRL), which is based on the Web Ontology Language (OWL) and shares its formal underpinnings. Our method uses vector space modeling of rule atoms and an ontology-based semantic similarity measure. We apply a clustering method to detect rule relatedness, and we use a statistical model selection method to find the optimal number of clusters within a rule base. Using three different SWRL rule bases, we evaluated the results of our semantic clustering method against those of our syntactic approach. We have found that our new approach creates clusters that better match the rule bases’ logical structures. Semantic clustering of rule bases may help users to more rapidly comprehend, acquire, and manage the growing numbers of rules on the Semantic Web.

Journal of Biomedical Semantics | 2013

A semantic-based method for extracting concept definitions from scientific publications: evaluation in the autism phenotype domain

Saeed Hassanpour; Martin J. O’Connor; Amar K. Das

Background: A variety of informatics approaches have been developed that use information retrieval, NLP and text-mining techniques to identify biomedical concepts and relations within scientific publications or their sentences. These approaches have not typically addressed the challenge of extracting more complex knowledge such as biomedical definitions. In our efforts to facilitate knowledge acquisition of rule-based definitions of autism phenotypes, we have developed a novel semantic-based text-mining approach that can automatically identify such definitions within text. Results: Using an existing knowledge base of 156 autism phenotype definitions and an annotated corpus of 26 source articles containing such definitions, we evaluated and compared the average rank of correctly identified rule definition or corresponding rule template using both our semantic-based approach and a standard term-based approach. We examined three separate scenarios: (1) the snippet of text contained a definition already in the knowledge base; (2) the snippet contained an alternative definition for a concept in the knowledge base; and (3) the snippet contained a definition not in the knowledge base. Our semantic-based approach had a higher average rank than the term-based approach for each of the three scenarios (scenario 1: 3.8 vs. 5.0; scenario 2: 2.8 vs. 4.9; and scenario 3: 4.5 vs. 6.2), with each comparison significant at the p-value of 0.05 using the Wilcoxon signed-rank test. Conclusions: Our work shows that leveraging existing domain knowledge in the information extraction of biomedical definitions significantly improves the correct identification of such knowledge within sentences. Our method can thus help researchers rapidly acquire knowledge about biomedical definitions that are specified and evolving within an ever-growing corpus of scientific publications.

Rules and Rule Markup Languages for the Semantic Web | 2011

A framework for the automatic extraction of rules from online text

Saeed Hassanpour; Martin J. O'Connor; Amar K. Das

The majority of knowledge on the Web is encoded in unstructured text and is not linked to formalized knowledge, such as ontologies and rules. A potential solution to this problem is to acquire this knowledge through natural language processing and text mining methods. Prior work has focused on automatically extracting RDF- or OWL-based ontologies from text; however, the type of knowledge acquired is generally restricted to simple term hierarchies. This paper presents a general-purpose framework for acquiring more complex relationships from text and then encoding this knowledge as rules. Our approach starts with existing domain knowledge in the form of OWL ontologies and Semantic Web Rule Language (SWRL) rules and applies natural language processing and text matching techniques to deduce classes and properties. It then captures deductive knowledge in the form of new rules. We have evaluated our framework by applying it to web-based text on car rental requirements. We show that our approach can automatically and accurately generate rules for requirements of car rental companies not in the knowledge base. Our framework thus rapidly acquires complex knowledge from free text sources. We are expanding it to handle richer domains, such as medical science.

International Semantic Web Conference | 2010

Visualizing logical dependencies in SWRL rule bases

Saeed Hassanpour; Martin J. O'Connor; Amar K. Das

Rule bases are common in many business rule applications, clinical decision support programs, and other types of intelligent systems. As the size of the rule bases grows and the interrelationships between rules become more complex, understanding dependencies among rules can be quite difficult. To address this challenge, we propose a novel approach for modeling logical dependencies among rules and for discovering patterns based on these dependencies. Our method uses rule bases written in the Semantic Web Rule Language (SWRL); we exploit SWRL's logical relationship with OWL to incorporate these semantics in our analysis. We couple this analysis with visualization techniques that create a rule dependency graph. We group nodes into layers based on their dependencies and cluster nodes within a layer if they have similar dependencies. We have evaluated our approach by applying it to two independently developed, publicly available ontologies containing SWRL rules. We show how our analysis and visualization approach can allow users to quickly examine patterns of logical relationships in such rule bases.
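
The idea of grouping the nodes of a rule dependency graph into layers can be illustrated with a small sketch. This is not the authors' algorithm: it assumes an acyclic graph, treats a rule's layer as one more than the deepest rule it depends on, and uses made-up rule names (`R1` to `R4`) and a hypothetical `assign_layers` helper.

```python
def assign_layers(depends_on: dict[str, set[str]]) -> dict[str, int]:
    """Assign each rule a layer: 0 if it has no dependencies,
    otherwise 1 + the highest layer among the rules it depends on."""
    layers: dict[str, int] = {}
    visiting: set[str] = set()

    def layer_of(rule: str) -> int:
        if rule in layers:
            return layers[rule]
        if rule in visiting:
            raise ValueError(f"cyclic dependency involving {rule!r}")
        visiting.add(rule)
        deps = depends_on.get(rule, set())
        # max(..., default=-1) makes dependency-free rules land in layer 0.
        layers[rule] = 1 + max((layer_of(d) for d in deps), default=-1)
        visiting.discard(rule)
        return layers[rule]

    for rule in depends_on:
        layer_of(rule)
    return layers


# Hypothetical rule base: R2 and R3 depend on R1; R4 depends on R2 and R3.
example = {"R1": set(), "R2": {"R1"}, "R3": {"R1"}, "R4": {"R2", "R3"}}
for rule, layer in sorted(assign_layers(example).items()):
    print(rule, "-> layer", layer)
```

In this toy example R2 and R3 share a layer and have identical dependencies, so they would be natural candidates for the within-layer clustering the abstract describes.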

American Journal of Roentgenology | 2017

Performance of a Machine Learning Classifier of Knee MRI Reports in Two Large Academic Radiology Practices: A Tool to Estimate Diagnostic Yield

Saeed Hassanpour; Curtis P. Langlotz; Timothy J. Amrhein; Nicholas T. Befera; Matthew P. Lungren

OBJECTIVE The purpose of this study is to evaluate the performance of a natural language processing (NLP) system in classifying a database of free-text knee MRI reports at two separate academic radiology practices. MATERIALS AND METHODS An NLP system that uses terms and patterns in manually classified narrative knee MRI reports was constructed. The NLP system was trained and tested on expert-classified knee MRI reports from two major health care organizations. Radiology reports were modeled in the training set as vectors, and a support vector machine framework was used to train the classifier. A separate test set from each organization was used to evaluate the performance of the system. We evaluated the performance of the system both within and across organizations. Standard evaluation metrics, such as accuracy, precision, recall, and F1 score (i.e., the weighted average of the precision and recall), and their respective 95% CIs were used to measure the efficacy of our classification system. RESULTS The accuracy for radiology reports that belonged to the model's clinically significant concept classes after training on data from the same institution was good, yielding an F1 score greater than 90% (95% CI, 84.6-97.3%). Performance of the classifier on cross-institutional application without institution-specific training data yielded F1 scores of 77.6% (95% CI, 69.5-85.7%) and 90.2% (95% CI, 84.5-95.9%) at the two organizations studied. CONCLUSION The results show excellent accuracy by the NLP machine learning classifier in classifying free-text knee MRI reports, supporting the institution-independent reproducibility of knee MRI report classification. Furthermore, the machine learning classifier performed well on free-text knee MRI reports from another institution. These data support the feasibility of multiinstitutional classification of radiologic imaging text reports with a single machine learning classifier without requiring institution-specific training data.

Journal of Digital Imaging | 2016

Unsupervised Topic Modeling in a Large Free Text Radiology Report Repository

Saeed Hassanpour; Curtis P. Langlotz

Radiology report narrative contains a large amount of information about the patient’s health and the radiologist’s interpretation of medical findings. Most of this critical information is entered in free text format, even when structured radiology report templates are used. The radiology report narrative varies in use of terminology and language among different radiologists and organizations. The free text format and the subtlety and variations of natural language hinder the extraction of reusable information from radiology reports for decision support, quality improvement, and biomedical research. Therefore, as the first step to organize and extract the information content in a large multi-institutional free text radiology report repository, we have designed and developed an unsupervised machine learning approach to capture the main concepts in a radiology report repository and partition the reports based on their main foci. In this approach, radiology reports are modeled in a vector space and compared to each other through a cosine similarity measure. This similarity is used to cluster radiology reports and identify the repository’s underlying topics. We applied our approach on a repository of 1,899,482 radiology reports from three major healthcare organizations. Our method identified 19 major radiology report topics in the repository and clustered the reports according to these topics. Our results are verified by a domain expert radiologist and successfully explain the repository’s primary topics and extract the corresponding reports. The results of our system provide a target-based corpus and framework for information extraction and retrieval systems for radiology reports.
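
Comparing reports in a vector space through cosine similarity can be sketched in a few lines. The sketch assumes raw term counts as the vector weights and whitespace tokenization, neither of which is stated in the abstract; the report snippets and the helper names `term_vector` and `cosine_similarity` are invented for illustration.

```python
import math
from collections import Counter


def term_vector(text: str) -> Counter:
    """Bag-of-words vector: lowercase terms mapped to raw counts (an assumption)."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term vectors; 0.0 if either is empty."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical report snippets, not drawn from the study's repository.
reports = [
    "no acute fracture of the left knee",
    "no acute fracture of the right knee",
    "clear lungs without focal consolidation",
]
vectors = [term_vector(r) for r in reports]
for i in range(len(reports)):
    for j in range(i + 1, len(reports)):
        print(i, j, round(cosine_similarity(vectors[i], vectors[j]), 3))
```

A clustering step would then group reports whose pairwise similarity is high; here the two knee reports score far closer to each other than either does to the chest report.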

Computer Vision and Pattern Recognition | 2017

Looking Under the Hood: Deep Neural Network Visualization to Interpret Whole-Slide Image Analysis Outcomes for Colorectal Polyps

Bruno Korbar; Andrea M. Olofson; Allen P. Miraflor; Catherine M. Nicka; Matthew A. Suriawinata; Lorenzo Torresani; Arief A. Suriawinata; Saeed Hassanpour

Histopathological characterization of colorectal polyps is an important principle for determining the risk of colorectal cancer and future rates of surveillance for patients. The process of characterization is time-intensive and requires years of specialized medical training. In this work, we propose a deep-learning-based image analysis approach that not only can accurately classify different types of polyps in whole-slide images, but also generates major regions and features on the slide through a model visualization approach. We argue that this visualization approach will make sense of the underlying reasons for the classification outcomes, significantly reduce the cognitive burden on clinicians, and improve the diagnostic accuracy for whole-slide image characterization tasks. Our results show the efficacy of this network visualization approach in recovering decisive regions and features for different types of polyps on whole-slide images according to the domain expert pathologists.
