Oana Frunza | Researchain

Publication

Featured research published by Oana Frunza.

Journal of the American Medical Informatics Association | 2010

A new algorithm for reducing the workload of experts in performing systematic reviews

Stan Matwin; Alexandre Kouznetsov; Diana Inkpen; Oana Frunza; Peter O'Blenis

OBJECTIVE: To determine whether a factorized version of the complement naïve Bayes (FCNB) classifier can reduce the time spent by experts reviewing journal articles for inclusion in systematic reviews of drug class efficacy for disease treatment. DESIGN: The proposed classifier was evaluated on a test collection built from 15 systematic drug class reviews used in previous work. The FCNB classifier was constructed to classify each article as containing high-quality, drug class-specific evidence or not. Weight engineering (WE) techniques were added to reduce underestimation for Medical Subject Headings (MeSH)-based and Publication Type (PubType)-based features. Cross-validation experiments were performed to evaluate the classifier's parameters and performance. MEASUREMENTS: Work saved over sampling (WSS) at no less than 95% recall was used as the main measure of performance. RESULTS: The minimum workload reduction for a systematic review for one topic, achieved with an FCNB/WE classifier, was 8.5%; the maximum was 62.2%, and the average over the 15 topics was 33.5%. This is 15.0% higher than the average workload reduction obtained using a voting perceptron-based automated citation classification system. CONCLUSION: The FCNB/WE classifier is simple, easy to implement, and produces significantly better results in reducing the workload than previously achieved. The results support it being a useful algorithm for machine-learning-based automation of systematic reviews of drug class efficacy for disease treatment.
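
The abstract above does not spell out the WSS measure. As a minimal sketch, assuming the definition commonly used in the systematic-review literature: WSS is the fraction of articles the reviewer no longer has to read, (TN + FN) / N, minus the saving that random sampling would give at the same recall, 1 - R. The function name and the example counts below are hypothetical and are not taken from the paper.

```python
def work_saved_over_sampling(true_negatives, false_negatives, total, recall):
    """Return WSS = (TN + FN) / N - (1 - recall).

    Assumed definition (not quoted from the abstract): the share of
    articles the classifier removes from the reading pile, minus the
    share that could be skipped by random sampling at the same recall.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0.0 <= recall <= 1.0:
        raise ValueError("recall must lie in [0, 1]")
    return (true_negatives + false_negatives) / total - (1.0 - recall)


if __name__ == "__main__":
    # Hypothetical counts: 1000 articles, 405 predicted "not relevant"
    # (400 correctly, 5 wrongly), measured at 95% recall.
    wss = work_saved_over_sampling(400, 5, 1000, 0.95)
    print(f"WSS@95% = {wss:.3f}")
```

Under this definition, a classifier that ranks articles no better than chance scores a WSS of about zero, which is why the measure is reported at a fixed recall level.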

IEEE Transactions on Knowledge and Data Engineering | 2011

A Machine Learning Approach for Identifying Disease-Treatment Relations in Short Texts

Oana Frunza; Diana Inkpen; Thomas T. Tran

The Machine Learning (ML) field has gained momentum in almost every domain of research and has only recently become a reliable tool in the medical domain. The empirical domain of automatic learning is used in tasks such as medical decision support, medical imaging, protein-protein interaction, extraction of medical knowledge, and overall patient management care. ML is envisioned as a tool by which computer-based systems can be integrated into the healthcare field to provide better, more efficient medical care. This paper describes an ML-based methodology for building an application that is capable of identifying and disseminating healthcare information. It extracts sentences from published medical papers that mention diseases and treatments, and identifies semantic relations that exist between diseases and treatments. Our evaluation results for these tasks show that the proposed methodology obtains reliable outcomes that could be integrated in an application to be used in the medical care domain. The potential value of this paper lies in the ML settings that we propose and in the fact that we outperform previous results on the same data set.

Artificial Intelligence in Medicine | 2011

Exploiting the systematic review protocol for classification of medical abstracts

Oana Frunza; Diana Inkpen; Stan Matwin; William Klement; Peter O'Blenis

OBJECTIVE: To determine whether the automatic classification of documents can be useful in systematic reviews on medical topics, and specifically whether the performance of the automatic classification can be enhanced by using the particular protocol of questions employed by the human reviewers to create multiple classifiers. METHODS AND MATERIALS: The test collection is the data used in a large-scale systematic review on the topic of the dissemination strategy of health care services for elderly people. From a group of 47,274 abstracts marked by human reviewers to be included in or excluded from further screening, we randomly selected 20,000 as a training set, with the remaining 27,274 becoming a separate test set. As a machine learning algorithm we used complement naïve Bayes. We tested both a global classification method, where a single classifier is trained on instances of abstracts and their classification (i.e., included or excluded), and a novel per-question classification method that trains multiple classifiers for each abstract, exploiting the specific protocol (questions) of the systematic review. For the per-question method we tested four ways of combining the results of the classifiers trained for the individual questions. As evaluation measures, we calculated precision and recall for several settings of the two methods. It is most important not to exclude any relevant documents (i.e., to attain high recall for the class of interest) but also desirable to exclude most of the non-relevant documents (i.e., to attain high precision on the class of interest) in order to reduce human workload. RESULTS: For the global method, the highest recall was 67.8% and the highest precision was 37.9%. For the per-question method, the highest recall was 99.2%, and the highest precision was 63%. The human-machine workflow proposed in this paper achieved a recall value of 99.6%, and a precision value of 17.8%. CONCLUSION: The per-question method that combines classifiers following the specific protocol of the review leads to better results than the global method in terms of recall. Because neither method is efficient enough to classify abstracts reliably by itself, the technology should be applied in a semi-automatic way, with a human expert still involved. When the workflow includes one human expert and the trained automatic classifier, recall improves to an acceptable level, showing that automatic classification techniques can reduce the human workload in the process of building a systematic review.
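
The abstract does not name the four combination schemes it tested. The sketch below is therefore only an illustration of the general idea: each protocol question has its own classifier that votes on exclusion, the votes are merged by a rule, and precision and recall are computed on the "included" class. The vote-threshold rule, the function names, and the toy data are all assumptions, not the paper's method.

```python
def combine_votes(exclude_votes, min_exclude_votes=1):
    """Merge per-question votes (True = 'exclude') into one label.

    Hypothetical rule: exclude the abstract only if at least
    `min_exclude_votes` question classifiers vote to exclude it.
    """
    if sum(exclude_votes) >= min_exclude_votes:
        return "excluded"
    return "included"


def precision_recall(gold, predicted, positive="included"):
    """Precision and recall for the class of interest."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy data: four abstracts, two question classifiers each.
    gold = ["included", "excluded", "included", "excluded"]
    votes = [[False, False], [True, True], [True, False], [False, False]]
    for threshold in (1, 2):
        predicted = [combine_votes(v, threshold) for v in votes]
        print(threshold, precision_recall(gold, predicted))
```

In this toy example, requiring more exclusion votes before discarding an abstract raises recall on the included class, which matches the abstract's stated priority of not losing relevant documents.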

Canadian Conference on Artificial Intelligence | 2009

Classifying Biomedical Abstracts Using Committees of Classifiers and Collective Ranking Techniques

Alexandre Kouznetsov; Stan Matwin; Diana Inkpen; Amir Hossein Razavi; Oana Frunza; Morvarid Sehatkar; Leanne Seaward; Peter O'Blenis

The purpose of this work is to reduce the workload of human experts in building systematic reviews from published articles, used in evidence-based medicine. We propose to use a committee of classifiers to rank biomedical abstracts based on the predicted relevance to the topic under review. In our approach, we identify two subsets of abstracts: one that represents the top, and another that represents the bottom of the ranked list. These subsets, identified using machine learning (ML) techniques, are considered zones where abstracts are labeled with high confidence as relevant or irrelevant to the topic of the review. Early experiments with this approach using different classifiers and different representation techniques show significant workload reduction.
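
The ranking-and-zones idea described above can be sketched as follows. The abstract does not say how the committee's outputs are merged or how the zone sizes are chosen, so averaging the member scores and using fixed zone sizes are assumptions made purely for illustration; the function name and data are hypothetical.

```python
def rank_and_zone(committee_scores, top_n, bottom_n):
    """Rank abstracts by mean committee score and split into zones.

    committee_scores maps an abstract id to the relevance scores given
    by each committee member (higher = more relevant). Returns three
    lists: the top zone (labeled relevant), the middle (left for human
    review), and the bottom zone (labeled irrelevant).
    """
    if top_n < 0 or bottom_n < 0:
        raise ValueError("zone sizes must be non-negative")
    if top_n + bottom_n > len(committee_scores):
        raise ValueError("zones overlap: too few abstracts")
    mean_score = {
        doc_id: sum(scores) / len(scores)
        for doc_id, scores in committee_scores.items()
    }
    ranked = sorted(mean_score, key=mean_score.get, reverse=True)
    cut = len(ranked) - bottom_n
    return ranked[:top_n], ranked[top_n:cut], ranked[cut:]


if __name__ == "__main__":
    scores = {"a": [0.9, 0.8], "b": [0.1, 0.2], "c": [0.5, 0.6], "d": [0.4, 0.3]}
    top, middle, bottom = rank_and_zone(scores, top_n=1, bottom_n=1)
    print("relevant:", top, "review by hand:", middle, "irrelevant:", bottom)
```

Only the abstracts in the middle list would still need expert screening, which is where the workload reduction would come from in such a scheme.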

Canadian Conference on Artificial Intelligence | 2011

Extracting relations between diseases, treatments, and tests from clinical data

Oana Frunza; Diana Inkpen

This paper describes research methodologies and experimental settings for the task of relation identification and classification between pairs of medical entities, using clinical data. The models that we use represent a combination of lexical and syntactic features, medical semantic information, terms extracted from a vector-space model created using a random projection algorithm, and additional contextual information extracted at the sentence level. The best results are obtained using an SVM classification algorithm with a combination of the above-mentioned features, plus a set of additional features that capture the distributional semantic correlation between the concepts and each relation of interest.

Meeting of the Association for Computational Linguistics | 2006

Semi-Supervised Learning of Partial Cognates Using Bilingual Bootstrapping

Oana Frunza; Diana Inkpen

Partial cognates are pairs of words in two languages that have the same meaning in some, but not all, contexts. Detecting the actual meaning of a partial cognate in context can be useful for Machine Translation tools and for Computer-Assisted Language Learning tools. In this paper we propose a supervised and a semi-supervised method to disambiguate partial cognates between two languages: French and English. The methods use only automatically-labeled data; therefore they can be applied to other pairs of languages as well. We also show that our methods perform well when using corpora from different domains.

Language Resources and Evaluation | 2008

Disambiguation of partial cognates

Oana Frunza; Diana Inkpen

Partial cognates are pairs of words in two languages that have the same meaning in some, but not all, contexts. Detecting the actual meaning of a partial cognate in context can be useful for Machine Translation tools and for Computer-Assisted Language Learning tools. We propose a supervised and a semi-supervised method to disambiguate partial cognates between two languages: French and English. The methods use only automatically-labeled data; therefore they can be applied to other pairs of languages as well. The aim of our work is to automatically detect the meaning of a French partial cognate word in a specific context.

Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing | 2008

Textual Information for Predicting Functional Properties of the Genes

Oana Frunza; Diana Inkpen

This paper focuses on determining which proteins affect the activity of the Aryl Hydrocarbon Receptor (AHR) system by learning a model that can accurately predict its activity when single genes are knocked out. We present experiments in which models are trained on a single source of information: abstracts from Medline (http://medline.cos.com/) that discuss the genes involved in the experiments. The results suggest that an AdaBoost classifier with a binary bag-of-words representation obtains significantly better results.
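
For readers unfamiliar with the term, a binary bag-of-words representation records only whether a word occurs in a document, not how often. The sketch below illustrates that representation alone; it does not implement AdaBoost, and the tokenization rule (lowercased alphanumeric runs), the function name, and the example sentences are assumptions rather than details from the paper.

```python
import re


def binary_bag_of_words(documents):
    """Return (vocabulary, vectors) with 0/1 presence features.

    Tokenization here is an assumed, simplistic rule: lowercase the
    text and keep runs of letters and digits.
    """
    tokenized = [set(re.findall(r"[a-z0-9]+", doc.lower())) for doc in documents]
    vocabulary = sorted(set().union(*tokenized))
    vectors = [
        [1 if term in tokens else 0 for term in vocabulary]
        for tokens in tokenized
    ]
    return vocabulary, vectors


if __name__ == "__main__":
    # Made-up sentences standing in for Medline abstracts.
    docs = ["AHR binds AHR ligand", "ligand response"]
    vocab, vecs = binary_bag_of_words(docs)
    print(vocab)
    print(vecs)
```

Note that the repeated word in the first sentence still yields a single 1, which is what distinguishes the binary variant from a count-based bag of words.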

Recent Advances in Natural Language Processing | 2005