John Cuzzola | Researchain

Publication

Featured research published by John Cuzzola.

IT Professional | 2014

Automated Semantic Tagging of Textual Content

Jelena Jovanovic; Ebrahim Bagheri; John Cuzzola; Dragan Gasevic; Zoran Jeremic; Reza Bashash

Motivated by a continually increasing demand for applications that depend on machine comprehension of text-based content, researchers in both academia and industry have developed innovative solutions for automated information extraction from text. In this article, the authors focus on a subset of such tools, semantic taggers, that not only extract and disambiguate entities mentioned in the text but also identify topics that unambiguously describe the text's main themes. The authors offer insight into the process of semantic tagging and the capabilities and specificities of today's semantic taggers, and indicate some of the criteria to be considered when choosing a tagger.

Canadian Conference on Artificial Intelligence | 2012

Exploiting semantic roles for asynchronous question answering in an educational setting

Dunwei Wen; John Cuzzola; Lorna M. Brown; Kinshuk

Recent question answering (QA) research has started to incorporate deep natural language processing (NLP), such as syntactic and semantic parsing, in order to enhance the capability of selecting the most relevant answers to a given question. However, current NLP technology involves intensive computing and thus struggles to meet the real-time demand of synchronous QA. To improve e-learning, we introduce NLP into a QA system that specifically exploits the communication latency between student and instructor. We present how the system fits an educational environment, and how semantic similarity matching between a question and its candidate answers can be improved by semantic roles. The designed system and its running results show the promise and potential of this research.

Applied Soft Computing | 2015

Automated classification and localization of daily deal content from the Web

John Cuzzola; Jelena Jovanovic; Ebrahim Bagheri; Dragan Gasevic

The identification of an effective and computationally inexpensive set of features that is able to discriminate between Web pages that consist of daily deal information and those that do not. The construction and systematic comparison of different learning machines based on the identified features that can efficiently classify pages as containing daily deal information with emphasis on both recall and precision. The development of a segmentation algorithm capable of determining the number of daily deals present on a given Web page, and accurately localizing those segments of the page that consist of information about one particular deal.

Websites offering daily deals have received widespread attention from end users. The objective of such Websites is to provide time-limited discounts on goods and services in the hope of enticing more customers to purchase such goods or services. The success of daily deal Websites has given rise to meta-level daily deal aggregator services that collect daily deal information from across the Web. Due to some of the unique characteristics of daily deal Websites, such as high update frequency, time sensitivity, and lack of coherent information representation, many deal aggregators rely on human intervention to identify and extract deal information. In this paper, we propose an approach where daily deal information is identified, classified, and properly segmented and localized. Our approach is based on a semi-supervised method that uses sentence-level features of daily deal information on a given Web page. Our work offers (i) a set of computationally inexpensive discriminative features that are able to effectively distinguish Web pages that contain daily deal information; (ii) the construction and systematic evaluation of machine learning techniques based on these features to automatically classify daily deal Web pages; and (iii) the development of an accurate segmentation algorithm that is able to localize and extract individual deals from within a complex Web page. We have extensively evaluated our approach from different perspectives, the results of which show notable performance.

Database and Expert Systems Applications | 2015

Filtering Inaccurate Entity Co-references on the Linked Open Data

John Cuzzola; Ebrahim Bagheri; Jelena Jovanovic

The Linked Open Data (LOD) initiative relies heavily on the interconnections between different open RDF datasets, where RDF links are used to connect resources. There has already been substantial research on identifying identity links between resources from different datasets, a process that is often referred to as co-reference resolution. These techniques often rely on probabilistic models or inference mechanisms to detect identity relations. However, recent studies have shown considerable inaccuracies in the LOD datasets that pertain to identity relations, e.g., owl:sameAs relations. In this paper, we propose a technique that evaluates existing identity links between LOD resources and identifies potentially erroneous links. Our work relies on the position and relevance of each resource with regard to the associated DBpedia categories, modeled through two probabilistic category distribution and selection functions. Our experimental results show that our work is able to semantically distinguish inaccurate identity links even in cases when high syntactical similarity is observed between two resources.

Journal of Biomedical Informatics | 2017

RysannMD: A Biomedical Semantic Annotator Balancing Speed and Accuracy.

John Cuzzola; Jelena Jovanovic; Ebrahim Bagheri

Recently, both researchers and practitioners have explored the possibility of semantically annotating large and continuously evolving collections of biomedical texts such as research papers, medical reports, and physician notes in order to enable their efficient and effective management and use in clinical practice or research laboratories. Such annotations can be automatically generated by biomedical semantic annotators, tools that are specifically designed for detecting and disambiguating biomedical concepts mentioned in text. The biomedical community has already presented several solid automated semantic annotators. However, the existing tools are either strong in their disambiguation capacity, i.e., the ability to identify the correct biomedical concept for a given piece of text among several candidate concepts, or they excel in their processing time, i.e., work very efficiently, but none of the semantic annotation tools reported in the literature has both of these qualities. In this paper, we present RysannMD (Ryerson Semantic Annotator for Medical Domain), a biomedical semantic annotation tool that strikes a balance between processing time and performance while disambiguating biomedical terms. In other words, RysannMD provides reasonable disambiguation performance when choosing the right sense for a biomedical term in a given context, and does that in a reasonable time. To examine how RysannMD stands with respect to the state-of-the-art biomedical semantic annotators, we have conducted a series of experiments using standard benchmarking corpora, including both gold and silver standards, and four modern biomedical semantic annotators, namely cTAKES, MetaMap, NOBLE Coder, and Neji. The annotators were compared with respect to the quality of the produced annotations, measured against gold and silver standards using precision, recall, and F1 measure, and with respect to speed, i.e., processing time.
In the experiments, RysannMD achieved the best median F1 measure across the benchmarking corpora, independent of the standard used (silver/gold), biomedical subdomain, and document size. In terms of the annotation speed, RysannMD scored the second best median processing time across all the experiments. The obtained results indicate that RysannMD offers the best performance among the examined semantic annotators when both quality of annotation and speed are considered simultaneously.
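The quality measures named in this abstract (precision, recall, F1, and a median across corpora) can be illustrated with a minimal Python sketch. It assumes annotations are compared by exact match on (start, end, concept id) triples, which may differ from the matching scheme actually used in the paper; the concept ids and the per-corpus scores below are made-up placeholders, not results from the study.

```python
import statistics
from typing import Set, Tuple

# (start offset, end offset, concept id); exact-match scoring is an assumption.
Annotation = Tuple[int, int, str]


def precision_recall_f1(predicted: Set[Annotation],
                        gold: Set[Annotation]) -> Tuple[float, float, float]:
    """Score predicted annotations against a gold (or silver) standard."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations with placeholder concept ids.
gold = {(0, 8, "CUI_A"), (15, 23, "CUI_B"), (30, 35, "CUI_C")}
predicted = {(0, 8, "CUI_A"), (15, 23, "CUI_B"), (40, 44, "CUI_D")}

p, r, f1 = precision_recall_f1(predicted, gold)

# Median F1 over several corpora; these scores are invented for illustration.
median_f1 = statistics.median([0.5, 0.7, 0.6])
```

Here two of the three predicted annotations match the gold standard, so precision, recall, and F1 all come out to 2/3.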

International Conference on Machine Learning and Applications | 2011

Fault Detection through Sequential Filtering of Novelty Patterns

John Cuzzola; Dragan Gasevic; Ebrahim Bagheri

Multi-threaded applications are commonplace in today's software landscape. Pushing the boundaries of concurrency and parallelism, programmers are maximizing performance demanded by stakeholders. However, multi-threaded programs are challenging to test and debug. Because such programs are prone to their own set of unique faults, such as race conditions, testers need to turn to automated validation tools for assistance. This paper's main contribution is a new algorithm called multi-stage novelty filtering (MSNF) that can aid in the discovery of software faults. MSNF stresses minimal configuration and requires no domain-specific data preprocessing or software metrics. The MSNF approach is based on a multi-layered support vector machine scheme. After experimentation with the MSNF algorithm, we observed promising results in terms of precision. However, MSNF relies on multiple iterations (i.e., stages). Here, we propose four different strategies for estimating the number of the requested stages.

Journal of the American Medical Informatics Association | 2018

UMLS to DBPedia link discovery through circular resolution

John Cuzzola; Ebrahim Bagheri; Jelena Jovanovic

Objective: The goal of this work is to map Unified Medical Language System (UMLS) concepts to DBpedia resources using widely accepted ontology relations from the Simple Knowledge Organization System (skos:exactMatch, skos:closeMatch) and from the Resource Description Framework Schema (rdfs:seeAlso). As a result, a complete mapping from UMLS (UMLS 2016AA) to DBpedia (DBpedia 2015-10) is made publicly available that includes 221 690 skos:exactMatch, 26 276 skos:closeMatch, and 6 784 322 rdfs:seeAlso mappings.
Methods: We propose a method called circular resolution that utilizes a combination of semantic annotators to map UMLS concepts to DBpedia resources. A set of annotators annotate definitions of UMLS concepts, returning DBpedia resources, while another set performs annotation on DBpedia resource abstracts, returning UMLS concepts. Our pipeline aligns these 2 sets of annotations to determine appropriate mappings from UMLS to DBpedia.
Results: We evaluate our proposed method using structured data from the Wikidata knowledge base as the ground truth, which consists of 4899 already existing UMLS to DBpedia mappings. Our results show an 83% recall with 77% precision-at-one (P@1) in mapping UMLS concepts to DBpedia resources on this testing set.
Conclusions: The proposed circular resolution method is a simple yet effective technique for linking UMLS concepts to DBpedia resources. Experiments using Wikidata-based ground truth reveal a high mapping accuracy. In addition to the complete UMLS mapping downloadable in n-triple format, we provide an online browser and a RESTful service to explore the mappings.
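The abstract does not spell out how the two sets of annotations are aligned. One plausible reading, sketched below in Python purely as an assumption, is that a UMLS concept and a DBpedia resource are linked when each direction of annotation returns the other. The function name `circular_matches`, the concept ids, and the hard-coded annotator outputs are hypothetical and stand in for real annotator calls.

```python
from typing import Dict, List, Set, Tuple


def circular_matches(forward: Dict[str, Set[str]],
                     backward: Dict[str, Set[str]]) -> List[Tuple[str, str]]:
    """Keep (concept, resource) pairs confirmed in both directions.

    forward:  UMLS concept -> DBpedia resources found in its definition.
    backward: DBpedia resource -> UMLS concepts found in its abstract.
    The "both directions agree" rule is an assumed alignment criterion.
    """
    matches = []
    for concept, resources in forward.items():
        for resource in resources:
            if concept in backward.get(resource, set()):
                matches.append((concept, resource))
    return sorted(matches)


# Hypothetical annotator outputs with placeholder identifiers.
forward = {
    "CUI_1": {"dbr:Aspirin", "dbr:Pain"},
    "CUI_2": {"dbr:Fever"},
}
backward = {
    "dbr:Aspirin": {"CUI_1"},
    "dbr:Pain": {"CUI_9"},
    "dbr:Fever": {"CUI_3"},
}

links = circular_matches(forward, backward)
```

In this toy input only the pair ("CUI_1", "dbr:Aspirin") closes the circle, so it is the only link kept.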

International Journal of Semantic Computing | 2016

Semantic Disambiguation and Linking of Quantitative Mentions in Textual Content

Mehrnaz Ghashghaei; Ebrahim Bagheri; John Cuzzola; Ali A. Ghorbani; Zeinab Noorian

Semantic annotation techniques provide the basis for linking textual content with concepts in well-grounded knowledge bases. In spite of their many application areas, current semantic annotation systems have some limitations. One of the prominent limitations is that none of the existing semantic annotator systems is able to identify and disambiguate quantitative (numerical) content. In textual documents such as Web pages, especially technical content, there is much quantitative information, such as product specifications, that needs to be semantically qualified. In this paper, we propose an approach for annotating quantitative values in short textual content. In our approach, we identify numeric values in the text and link them to an existing property in a knowledge base. Based on this mapping, we are then able to find the concept that the property is associated with, thereby identifying both the concept and the specific property of that concept that the numeric value belongs to. Results obtained from the developed gold standard dataset show that the proposed automated semantic annotation platform is quite effective in detecting and disambiguating numerical content, and connecting it to associated properties in the external knowledge base. Our experiments show that our proposed approach is able to reach an accuracy of over 70% for semantically annotating quantitative content.
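The two steps described here (find a numeric value, link it to a property, then recover the concept that owns the property) can be illustrated with a toy Python sketch. It is not the authors' method: it assumes the unit following a number is enough to pick a property, and the `UNIT_TO_PROPERTY` table, its concept names, and its property names are invented stand-ins for a real knowledge base.

```python
import re
from typing import List, Tuple

# Hypothetical mini knowledge base: unit -> (concept, property).
UNIT_TO_PROPERTY = {
    "gb": ("Smartphone", "storageCapacity"),
    "mah": ("Smartphone", "batteryCapacity"),
    "kg": ("Product", "weight"),
}

# A number (integer or decimal) followed by an alphabetic unit token.
MENTION = re.compile(r"(\d+(?:\.\d+)?)\s*([A-Za-z]+)")


def link_quantities(text: str) -> List[Tuple[float, str, str]]:
    """Return (value, concept, property) for each recognized quantity."""
    links = []
    for value, unit in MENTION.findall(text):
        entry = UNIT_TO_PROPERTY.get(unit.lower())
        if entry is not None:  # skip numbers whose unit is unknown
            concept, prop = entry
            links.append((float(value), concept, prop))
    return links


result = link_quantities("Comes with 64 GB storage and a 3000 mAh battery")
```

A real system would need to disambiguate among several candidate properties for the same unit, which this lookup table sidesteps.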

Expert Systems With Applications | 2015