Javed Mostafa | Researchain

Publication

Featured research published by Javed Mostafa.

ACM Transactions on Information Systems | 1997

A multilevel approach to intelligent information filtering: model, system, and evaluation

Javed Mostafa; Snehasis Mukhopadhyay; Mathew J. Palakal; Wai Lam

In information-filtering environments, uncertainties associated with changing interests of the user and the dynamic document stream must be handled efficiently. In this article, a filtering model is proposed that decomposes the overall task into subsystem functionalities and highlights the need for multiple adaptation techniques to cope with uncertainties. A filtering system, SIFTER, has been implemented based on the model, using established techniques in information retrieval and artificial intelligence. These techniques include document representation by a vector-space model, document classification by unsupervised learning, and user modeling by reinforcement learning. The system can filter information based on content and a user's specific interests. The user's interests are automatically learned with only limited user intervention in the form of optional relevance feedback for documents. We also describe experimental studies conducted with SIFTER to filter computer and information science documents collected from the Internet and commercial database services. The experimental results demonstrate that the system performs very well in filtering documents in a realistic problem setting.
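The vector-space step mentioned in this abstract can be illustrated with a minimal sketch. It is not SIFTER's implementation: the vocabulary, the profile weights, the sample documents, and the function names are all assumptions made for illustration, and the unsupervised classification and reinforcement-learning stages are left out.

```python
import math
from collections import Counter

def tf_vector(text, vocabulary):
    """Term-frequency vector of `text` over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_documents(docs, profile, vocabulary):
    """Order document ids by similarity to the profile, best first."""
    scored = [(cosine(tf_vector(text, vocabulary), profile), doc_id)
              for doc_id, text in docs.items()]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]

# Hypothetical vocabulary, user profile, and documents.
vocabulary = ["filtering", "retrieval", "neural", "protein"]
profile = [1.0, 1.0, 0.0, 0.0]  # assumed interest in filtering and retrieval
docs = {
    "d1": "information filtering and retrieval systems",
    "d2": "protein names in biomedical text",
    "d3": "neural network for document filtering",
}
ranking = rank_documents(docs, profile, vocabulary)
```

In a full system the profile vector would be adjusted from relevance feedback rather than fixed by hand as it is here.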

International ACM SIGIR Conference on Research and Development in Information Retrieval | 1996

Detection of shifts in user interests for personalized information filtering

Wai Lam; Snehasis Mukhopadhyay; Javed Mostafa; Mathew J. Palakal

W. Lam*, S. Mukhopadhyay, J. Mostafa**, and M. Palakal. Computer and Information Science, Purdue University School of Science at Indianapolis, 723 W. Michigan St. SL280, Indianapolis, IN 46202. *Department of Management Sciences, S306 Pappajohn Building, The University of Iowa, Iowa City, Iowa 52242-1000. **School of Library and Information

Information Processing and Management | 2000

Automatic classification using supervised learning in a medical document filtering application

Javed Mostafa; Wai Lam

Document classifiers can play an intermediate role in multilevel filtering systems. The effectiveness of a classifier that uses supervised learning was analyzed in terms of its accuracy and ultimately its influence on filtering. The analysis was conducted in two phases. In the first phase, a multilayer feed-forward neural network was trained to classify medical documents in the area of cell biology. The accuracy of the supervised classifier was established by comparing its performance with a baseline system that uses human classification information. A relatively high degree of accuracy was achieved by the supervised method; however, classification accuracy varied across classes. In the second phase, to clarify the impact of this performance on filtering, different types of user profiles were created by grouping subsets of classes based on their individual classification accuracy rates. Then, a filtering system with the neural network integrated into it was used to filter the medical documents, and this performance was compared with the filtering results achieved using the baseline system. The performance of the system using the neural network classifier was generally satisfactory and, as expected, the filtering performance varied with regard to the accuracy rates of classes.

Journal of the Association for Information Science and Technology | 2001

Modeling user interest shift using a Bayesian approach

Wai Lam; Javed Mostafa

We investigate the modeling of changes in user interest in information filtering systems. A new technique for tracking user interest shifts based on a Bayesian approach is developed. The interest tracker is integrated into a profile learning module of a filtering system. We present an analytical study to establish the rate of convergence for the profile learning with and without the user interest tracking component. We examine the relationship among degree of shift, cost of detection error, and time needed for detection. To study the effect of different patterns of interest shift on system performance we also conducted several filtering experiments. Generally, the findings show that the Bayesian approach is a feasible and effective technique for modeling user interest shift.
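To make the idea of Bayesian shift tracking concrete, here is a minimal sketch of a two-hypothesis Bayes update. It is not the model from the paper: the function name, the two hypotheses ("stable" versus "shifted"), and the likelihood values are assumptions chosen purely for illustration. The feedback is assumed to concern documents that the current profile rates highly.

```python
def update_shift_belief(prior, relevant,
                        p_rel_if_shift=0.2, p_rel_if_stable=0.8):
    """Posterior probability that the user's interest has shifted.

    prior    -- current belief that a shift has occurred (0..1)
    relevant -- True if the user marked a profile-matching document relevant
    The two likelihoods are assumed values, not estimates from any data.
    """
    like_shift = p_rel_if_shift if relevant else 1 - p_rel_if_shift
    like_stable = p_rel_if_stable if relevant else 1 - p_rel_if_stable
    evidence = like_shift * prior + like_stable * (1 - prior)
    return like_shift * prior / evidence

# Three consecutive rejections of profile-matching documents
# raise the belief that the interest has shifted.
belief = 0.1
for _ in range(3):
    belief = update_shift_belief(belief, relevant=False)
```

A filtering system could compare such a belief against a threshold before resetting or relearning the profile; the threshold trades detection delay against the cost of false alarms, which is the relationship the abstract says the paper examines.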

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2005

An application of text categorization methods to gene ontology annotation

Kazuhiro Seki; Javed Mostafa

This paper describes an application of IR and text categorization methods to a highly practical problem in biomedicine, specifically, Gene Ontology (GO) annotation. GO annotation is a major activity in most model organism database projects and annotates gene functions using a controlled vocabulary. As a first step toward automatic GO annotation, we aim to assign GO domain codes given a specific gene and an article in which the gene appears, which is one of the task challenges at the TREC 2004 Genomics Track. We approached the task with careful consideration of the specialized terminology and paid special attention to dealing with various forms of gene synonyms, so as to exhaustively locate the occurrences of the target gene. We extracted the words around the gene occurrences and used them to represent the gene for GO domain code annotation. As a classifier, we adopted a variant of k-Nearest Neighbor (kNN) with supervised term weighting schemes to improve the performance, making our method among the top-performing systems in the TREC official evaluation. Moreover, it is demonstrated that our proposed framework is successfully applied to another task of the Genomics Track, showing comparable results to the best performing system.
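The two steps described above, collecting words around gene mentions and classifying with k-Nearest Neighbor, can be sketched roughly as follows. This is a plain similarity-weighted kNN without the supervised term weighting the paper reports; the window size, the gene names, the training examples, and the labels are hypothetical.

```python
import math
from collections import Counter

def context_words(tokens, gene_names, window=2):
    """Count words within `window` positions of any gene mention."""
    names = {name.lower() for name in gene_names}
    context = Counter()
    for i, token in enumerate(tokens):
        if token.lower() in names:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                word = tokens[j].lower()
                if j != i and word not in names:
                    context[word] += 1
    return context

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_predict(query, training, k=3):
    """Similarity-weighted vote among the k most similar training examples."""
    nearest = sorted(training, key=lambda ex: cosine(query, ex[0]),
                     reverse=True)[:k]
    votes = Counter()
    for vector, label in nearest:
        votes[label] += cosine(query, vector)
    return votes.most_common(1)[0][0]

# Hypothetical sentence, gene synonyms, and labeled examples.
tokens = "the kinase ABC1 binds membrane receptors in cells".split()
query = context_words(tokens, ["ABC1", "abc-1"])
training = [
    (Counter({"kinase": 1, "binds": 1}), "function"),
    (Counter({"membrane": 1, "nucleus": 1}), "component"),
    (Counter({"kinase": 1, "activity": 1}), "function"),
]
predicted = knn_predict(query, training)
```

Passing every known synonym in `gene_names` is the simplest stand-in for the synonym handling the abstract emphasizes; real gene-name matching needs far more care than exact token comparison.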

Journal of the Association for Information Science and Technology | 1998

Filtering medical documents using automated and human classification methods

Javed Mostafa; Luz Marina Quiroga; Mathew Palakal

The goal of this research is to clarify the role of document classification in information filtering. An important function of classification, in managing computational complexity, is described and illustrated in the context of an existing filtering system. A parameter called classification homogeneity is presented for analyzing unsupervised automated classification by employing human classification as a control. Two significant components of the automated classification approach, vocabulary discovery and classification scheme generation, are described in detail. Results of classification performance revealed considerable variability in the homogeneity of automatically produced classes. Based on the classification performance, different types of interest profiles were created. Subsequently, these profiles were used to perform filtering sessions. The filtering results showed that with increasing homogeneity, filtering performance improves, and, conversely, with decreasing homogeneity, filtering performance degrades.

ACM International Conference on Digital Libraries | 1999

Empirical evaluation of explicit versus implicit acquisition of user profiles in information filtering systems

Luz Marina Quiroga; Javed Mostafa

To make digital libraries attractive and encourage use, new and value-added services are needed beyond conventional distribution and access mechanisms. An exciting area of development is information personalization services that route, recommend, sort and prune documents (henceforth collectively called filtering) based on users’ interest profiles. Significant advances have been made in filtering systems. However, few studies have considered how different approaches to acquiring profiles can influence filtering effectiveness. Profiles are at the center of our research, and one of the issues we are focusing on is profile acquisition by a filtering system that provides general health information.

Pacific Symposium on Biocomputing | 2006

Discovering implicit associations between genes and hereditary diseases

Kazuhiro Seki; Javed Mostafa

We propose an approach to predicting implicit gene-disease associations based on the inference network, whereby genes and diseases are represented as nodes and are connected via two types of intermediate nodes: gene functions and phenotypes. To estimate the probabilities involved in the model, two learning schemes are compared; one baseline using co-annotations of keywords and the other taking advantage of free text. Additionally, we explore the use of domain ontologies to complement data sparseness and examine the impact of full text documents. The validity of the proposed framework is demonstrated on the benchmark data set created from real-world data.
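One way to picture scoring over such a network is to sum probability products along every gene, function, phenotype, disease path. The sketch below assumes exactly that chain structure and uses invented probabilities and identifiers; the paper's actual network layout and estimation schemes may differ.

```python
def association_score(gene, disease, p_func_given_gene,
                      p_pheno_given_func, p_disease_given_pheno):
    """Sum path probabilities gene -> function -> phenotype -> disease.

    Each argument after `disease` is a nested dict of assumed
    conditional probabilities, e.g. p_func_given_gene[gene][function].
    """
    total = 0.0
    for func, p_f in p_func_given_gene.get(gene, {}).items():
        for pheno, p_p in p_pheno_given_func.get(func, {}).items():
            p_d = p_disease_given_pheno.get(pheno, {}).get(disease, 0.0)
            total += p_f * p_p * p_d
    return total

# Toy probabilities; all identifiers are hypothetical.
p_func_given_gene = {"G1": {"F1": 0.5, "F2": 0.5}}
p_pheno_given_func = {"F1": {"P1": 1.0}, "F2": {"P2": 1.0}}
p_disease_given_pheno = {"P1": {"D1": 0.8}, "P2": {"D1": 0.2}}

score = association_score("G1", "D1", p_func_given_gene,
                          p_pheno_given_func, p_disease_given_pheno)
```

Candidate genes for a disease could then be ranked by this score. How the conditional probabilities are estimated (keyword co-annotations versus free text) is the comparison the abstract describes and is not modeled here.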

Computational Systems Bioinformatics | 2003

A probabilistic model for identifying protein names and their name boundaries

Kazuhiro Seki; Javed Mostafa

This paper proposes a method for identifying protein names in biomedical texts with an emphasis on detecting protein name boundaries. We use a probabilistic model which exploits several surface clues characterizing protein names and incorporates word classes for generalization. In contrast to previously proposed methods, our approach does not rely on natural language processing tools such as part-of-speech taggers and syntactic parsers, so as to reduce processing overhead and the potential number of probabilistic parameters to be estimated. A notion of certainty is also proposed to improve precision for identification. We implemented a protein name identification system based on our proposed method, and evaluated the system on real-world biomedical texts in conjunction with the previous work. The results showed that overall our system performs comparably to the state-of-the-art protein name identification system and that higher performance is achieved for compound names. In addition, it is demonstrated that our system can further improve precision by restricting the system output to those names with high certainties.

International Parallel and Distributed Processing Symposium | 2001

A comparison between single-agent and multi-agent classification of documents

Shengquan Peng; Snehasis Mukhopadhyay; Rajeev R. Raje; Mathew J. Palakal; Javed Mostafa

Information services such as searching, retrieval, and filtering play a dominant role in our lives in the current information age. One critical function of these information services is to obtain effective classification of input documents. Thesaurus (vocabulary)-based document representation followed by clustering constitutes a popular approach to document classification. However, two alternatives exist for constructing the information classification system. The first one uses a single, monolithic, huge thesaurus and classifies all documents on one centralized machine. The second one exploits distributed computing environments by allowing multiple agents with small thesauri to collaborate with each other over a computer network. The objective of this paper is to compare these two approaches (i.e., single-agent and multi-agent) in terms of various criteria including response time, quality of classification, and economic/privacy considerations. Two experimental studies, involving classification of Computer Science and Medline documents, are presented to compare the performance of a single-agent system with that of a multi-agent system in real-world settings. These results indicate that a collaborative multi-agent system constitutes an attractive methodology for classifying a large volume of information efficiently when the thesaurus is large.
