Amir Hossein Razavi
University of Ottawa
Publications
Featured research published by Amir Hossein Razavi.
Canadian Conference on Artificial Intelligence | 2010
Amir Hossein Razavi; Diana Inkpen; Sasha Uritsky; Stan Matwin
Text messaging through the Internet or cellular phones has become a major medium of personal and commercial communication. At the same time, flames (such as rants, taunts, and squalid phrases) are offensive or abusive phrases that might attack or offend users for a variety of reasons. An automatic discriminative software tool with a sensitivity parameter for detecting flames or abusive language would therefore be useful. Although a human can recognize these sorts of useless, annoying texts among the useful ones, it is not an easy task for computer programs. In this paper, we describe an automatic flame detection method which extracts features at different conceptual levels and applies multi-level classification for flame detection. While the system takes advantage of a variety of statistical models and rule-based patterns, an auxiliary weighted pattern repository improves accuracy by matching the text to its graded entries.
Canadian Conference on Artificial Intelligence | 2009
Alexandre Kouznetsov; Stan Matwin; Diana Inkpen; Amir Hossein Razavi; Oana Frunza; Morvarid Sehatkar; Leanne Seaward; Peter O'Blenis
The purpose of this work is to reduce the workload of human experts in building systematic reviews from published articles, used in evidence-based medicine. We propose to use a committee of classifiers to rank biomedical abstracts based on the predicted relevance to the topic under review. In our approach, we identify two subsets of abstracts: one that represents the top, and another that represents the bottom of the ranked list. These subsets, identified using machine learning (ML) techniques, are considered zones where abstracts are labeled with high confidence as relevant or irrelevant to the topic of the review. Early experiments with this approach using different classifiers and different representation techniques show significant workload reduction.
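The ranking-and-zones idea described in this abstract can be illustrated with a minimal sketch. It assumes each committee member already outputs a relevance score in [0, 1] per abstract and that the committee simply averages those scores; the abstract does not state how votes are combined or how zone sizes are chosen, so `rank_by_committee`, `confidence_zones`, `top_k`, and `bottom_k` are hypothetical names and parameters used only for illustration.

```python
from statistics import mean


def rank_by_committee(scores_per_classifier):
    """Average each abstract's relevance score across committee members.

    scores_per_classifier: list of lists; row i holds classifier i's
    scores for every abstract (same abstract order in every row).
    Returns (abstract_index, mean_score) pairs, most relevant first.
    Averaging is an assumption, not the paper's stated combination rule.
    """
    n_abstracts = len(scores_per_classifier[0])
    averaged = [
        (idx, mean(row[idx] for row in scores_per_classifier))
        for idx in range(n_abstracts)
    ]
    return sorted(averaged, key=lambda pair: pair[1], reverse=True)


def confidence_zones(ranked, top_k, bottom_k):
    """Split a ranked list into top, middle, and bottom index lists.

    The top zone is treated as confidently relevant, the bottom zone as
    confidently irrelevant; only the middle is left for human review.
    """
    cut = len(ranked) - bottom_k
    top = [idx for idx, _ in ranked[:top_k]]
    middle = [idx for idx, _ in ranked[top_k:cut]]
    bottom = [idx for idx, _ in ranked[cut:]]
    return top, middle, bottom


# Three hypothetical classifiers scoring four abstracts.
scores = [
    [0.9, 0.2, 0.5, 0.1],
    [0.8, 0.3, 0.6, 0.1],
    [0.7, 0.1, 0.4, 0.1],
]
ranked = rank_by_committee(scores)
top, middle, bottom = confidence_zones(ranked, top_k=1, bottom_k=1)
```

In this sketch the workload reduction corresponds to the share of abstracts that fall outside the middle zone and so would not need manual screening.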
International Conference on Data Mining | 2009
Amir Hossein Razavi; Stan Matwin; Diana Inkpen; Alexandre Kouznetsov
In this article, we present a novel statistical representation method for knowledge extraction from a corpus containing short texts. Then we introduce the contrast parameter which could be adjusted for targeting different conceptual levels in text mining and knowledge extraction. The method is based on second order co-occurrence vectors whose efficiency for representing meaning has been established in many applications, especially for representing word senses in different contexts and for disambiguation purposes. We evaluate our method on two tasks: classification of textual description of dreams, and classification of medical abstracts for systematic reviews.
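As a rough illustration of second-order co-occurrence vectors in general (not the authors' specific method, and without the contrast parameter, which the abstract does not define), a short text can be represented by averaging the first-order co-occurrence vectors of its words. The function names, the whitespace tokenization, and the use of a whole text as the co-occurrence window are assumptions made for this sketch.

```python
from collections import Counter, defaultdict


def first_order_vectors(corpus):
    """Map each word to a Counter of the words it shares a text with."""
    cooc = defaultdict(Counter)
    for text in corpus:
        words = set(text.lower().split())  # naive whitespace tokenization
        for word in words:
            for other in words:
                if other != word:
                    cooc[word][other] += 1
    return cooc


def second_order_vector(text, cooc):
    """Represent a text as the average of its words' first-order vectors.

    Words never seen in the corpus are skipped; a text with no known
    words yields an empty vector.
    """
    words = [w for w in text.lower().split() if w in cooc]
    if not words:
        return {}
    total = Counter()
    for word in words:
        total.update(cooc[word])
    return {term: count / len(words) for term, count in total.items()}


corpus = ["dream of flying", "flying over water", "water dream"]
cooc = first_order_vectors(corpus)
vec = second_order_vector("dream water", cooc)
```

Here `vec` gives weight to "over" and "of" even though neither appears in the text "dream water", which is the property that makes second-order vectors useful for short, sparse texts.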
Canadian Conference on Artificial Intelligence | 2013
Amir Hossein Razavi; Diana Inkpen; Dmitry Brusilovsky; Lana Bogouslavski
In this article, we present a novel document annotation method that can be applied on corpora containing short documents such as social media texts. The method applies Latent Dirichlet Allocation (LDA) on a corpus to initially infer some topical word clusters. Each document is assigned one or more topic clusters automatically. Further document annotation is done through a projection of the topics extracted and assigned by LDA into a set of generic categories. The translation from the topical clusters to the small set of generic categories is done manually. Then the categories are used to automatically annotate the general topics of the documents. It is remarkable that the number of the topical clusters that need to be manually mapped to the general topics is far smaller than the number of postings of a corpus that normally need to be annotated to build training and testing sets manually. We show that the accuracy of the annotation done through this method is about 80% which is comparable with inter-human agreement in similar tasks. Additionally, using the LDA method, the corpus entries are represented by low-dimensional vectors which lead to good classification results. The lower-dimensional representation can be fed into many machine learning algorithms that cannot be applied on the conventional high-dimensional text representation methods.
Canadian Conference on Artificial Intelligence | 2014
Amir Hossein Razavi; Diana Inkpen
We introduce a novel text representation method to be applied on corpora containing short / medium length textual documents. The method applies Latent Dirichlet Allocation (LDA) on a corpus to infer its major topics, which will be used for document representation. The representation that we propose has multiple levels (granularities) by using different numbers of topics. We postulate that interpreting data in a more general space, with fewer dimensions, can improve the representation quality. Experimental results support the informative power of our multi-level representation vectors. We show that choosing the correct granularity of representation is an important aspect of text classification. We propose a multi-level representation, at different topical granularities, rather than choosing one level. The documents are represented by topical relevancy weights, in a low-dimensional vector representation. Finally, the proposed representation is applied to a text classification task using several well-known classification algorithms. We show that it leads to very good classification performance. Another advantage is that, with a small compromise on accuracy, our low-dimensional representation can be fed into many supervised or unsupervised machine learning algorithms that empirically cannot be applied on the conventional high-dimensional text representation methods.
Journal of Internal Medicine | 2018
G S Handelman; H K Kok; Ronil V. Chandra; Amir Hossein Razavi; Michael J. Lee; Hamed Asadi
Machine learning (ML) is a burgeoning field of medicine, with huge resources being applied to fuse computer science and statistics to medical problems. Proponents of ML extol its ability to deal with the large, complex and disparate data often found within medicine, and feel that ML is the future for biomedical research, personalized medicine and computer-aided diagnosis, with the potential to significantly advance global health care. However, the concepts of ML are unfamiliar to many medical professionals and there is untapped potential in the use of ML as a research tool. In this article, we provide an overview of the theory behind ML, explore the common ML algorithms used in medicine including their pitfalls and discuss the potential future of ML in medicine.
Procedia Computer Science | 2014
Kambiz Ghazinour; Amir Hossein Razavi; Ken Barker
Privacy concerns exist whenever sensitive data relating to people is collected. Finding a way to preserve and guarantee an individual's privacy has always been of high importance. Some may decide not to reveal their data to protect their privacy. However, it has become impossible to take advantage of many essential customized services without disclosing any identifying or sensitive data. The challenge is that each data item may have a different value for different individuals. These values can be defined by applying weights that describe the importance of data items for individuals if that particular private data item is exposed. We propose a generic framework to capture these weights from data providers, which can be considered as a mediator to quantify privacy compromisation. This framework also helps us to identify what portion of a targeted population is vulnerable to compromise their privacy in return for receiving certain incentives. Conversely, the model could assist researchers to offer appropriate incentives to a targeted population to facilitate collecting useful data.
Computer-Based Medical Systems | 2013
Amir Hossein Razavi; Kambiz Ghazinour
This paper describes our study of the incidence of Personal Health Information (PHI) on the Web. PHI is usually shared under conditions of confidentiality, protection and trust, and should not be disclosed or available to unrelated third parties or the general public. We first analyzed the characteristics that potentially make systems successful in identification of unsolicited or unjustified PHI disclosures. In the next stage, we designed and implemented an integrated Natural Language Processing/Machine Learning (NLP/ML)-based system that detects disclosures of personal health information, specifically according to the above characteristics including detected patterns. This research is regarded as the first step toward a learning system that will be trained based on a limited training set built on the result of the processing chain described in the paper in order to generally detect the PHI disclosures over the web.
2014 IEEE International Inter-Disciplinary Conference on Cognitive Methods in Situation Awareness and Decision Support (CogSIMA) | 2014
Amir Hossein Razavi; Diana Inkpen; Rafael Falcon; Rami S. Abielmona
Intelligent Information Systems | 2014
Amir Hossein Razavi; Stan Matwin; Joseph De Koninck; Ray Reza Amini