Mehreen Saeed

National University of Computer and Emerging Sciences

Publication

Featured research published by Mehreen Saeed.

Expert Systems With Applications | 2015

Relative discrimination criterion - A novel feature ranking method for text data

Abdur Rehman; Kashif Javed; Haroon Atique Babri; Mehreen Saeed

Discussed characteristics of text data. Indicated that term counts are ignored when calculating term rank. Proposed a new feature ranking algorithm (RDC) that considers term counts. Compared the performance of RDC with four feature ranking metrics on four datasets. RDC shows the highest performance in 65% of the classification cases.

High dimensionality of text data hinders the performance of classifiers, making it necessary to apply feature selection for dimensionality reduction. Most of the feature ranking metrics for text classification are based on document frequencies (df) of a term in positive and negative classes. Considering only document frequencies to rank features favors terms frequently occurring in larger classes in unbalanced datasets. In this paper we introduce a new feature ranking metric termed relative discrimination criterion (RDC), which takes document frequencies for each term count of a term into account while estimating the usefulness of a term. The performance of RDC is compared with four well-known feature ranking metrics, information gain (IG), CHI squared (CHI), odds ratio (OR) and distinguishing feature selector (DFS), using support vector machines (SVM) and multinomial naive Bayes (MNB) classifiers on four benchmark datasets, namely Reuters, 20 Newsgroups and two subsets of the Ohsumed dataset. Our results based on macro and micro F1 measures show that the performance of RDC is superior to the other four metrics in 65% of our experimental trials. Also, RDC attains the highest macro and micro F1 values in 69% of the cases.
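
The document-frequency-based ranking that this abstract contrasts RDC with can be sketched as follows. This is a minimal illustration using the standard chi-squared statistic on a 2x2 term/class table; the toy corpus and the function names are assumptions made for the example, and it is not the RDC metric, whose formula is not given here.

```python
def document_frequency(docs, term):
    """Number of documents containing the term (term counts are ignored)."""
    return sum(1 for doc in docs if term in doc)


def chi_squared(tp, fp, n_pos, n_neg):
    """Chi-squared score from document frequencies.

    tp: positive-class documents containing the term
    fp: negative-class documents containing the term
    """
    fn = n_pos - tp  # positive documents without the term
    tn = n_neg - fp  # negative documents without the term
    n = n_pos + n_neg
    numerator = n * (tp * tn - fp * fn) ** 2
    denominator = (tp + fp) * (fn + tn) * n_pos * n_neg
    return numerator / denominator if denominator else 0.0


# Hypothetical toy corpus: each document is a list of tokens.
pos_docs = [["goal", "goal", "goal", "match"], ["goal", "match"]]
neg_docs = [["market", "match"], ["market", "shares"]]


def score(term):
    return chi_squared(
        document_frequency(pos_docs, term),
        document_frequency(neg_docs, term),
        len(pos_docs),
        len(neg_docs),
    )


scores = {term: score(term) for term in ["goal", "match", "market", "shares"]}
for term, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {value:.3f}")
```

Because `document_frequency` only tests for presence, the three occurrences of "goal" in the first document contribute no more than a single occurrence would, which is the limitation the abstract points at.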

Neurocomputing | 2013

Machine learning using Bernoulli mixture models: Clustering, rule extraction and dimensionality reduction

Mehreen Saeed; Kashif Javed; Haroon Atique Babri

Probabilistic models are common in the machine learning community for representing and modeling data. In this paper we focus on a probabilistic model based upon Bernoulli mixture models to solve different types of problems in pattern recognition like feature selection, classification, dimensionality reduction and rule generation. We illustrate the effectiveness of Bernoulli mixture models by applying them to various real life datasets taken from different domains, and used as part of various machine learning challenges. Our algorithms, based upon Bernoulli mixture models, are not only simple and intuitive but have also proven to give accurate and good results.
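
As a rough illustration of the model class this abstract refers to, the sketch below fits a Bernoulli mixture to binary vectors with a plain EM loop. It is a generic textbook-style sketch, not the authors' algorithms; the function name, the clamping constant `eps`, and the idealised toy data are assumptions.

```python
import math
import random


def fit_bernoulli_mixture(data, k, n_iter=50, seed=0, eps=1e-6):
    """Fit a k-component Bernoulli mixture to binary vectors with EM."""
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    weights = [1.0 / k] * k
    # Random start away from 0 and 1 to break symmetry between components.
    probs = [[rng.uniform(0.25, 0.75) for _ in range(d)] for _ in range(k)]
    resp = []
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability.
        resp = []
        for x in data:
            log_p = []
            for j in range(k):
                lp = math.log(weights[j])
                for xi, p in zip(x, probs[j]):
                    lp += math.log(p if xi else 1.0 - p)
                log_p.append(lp)
            top = max(log_p)
            unnorm = [math.exp(lp - top) for lp in log_p]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: update mixing weights and per-dimension probabilities.
        totals = [sum(r[j] for r in resp) for j in range(k)]
        raw = [max(t / n, eps) for t in totals]
        weights = [w / sum(raw) for w in raw]
        for j in range(k):
            for i in range(d):
                mean = sum(r[j] * x[i] for r, x in zip(resp, data))
                mean /= max(totals[j], eps)
                probs[j][i] = min(max(mean, eps), 1.0 - eps)
    return weights, probs, resp


# Idealised toy data: two noise-free binary patterns, five copies each.
data = [[1, 1, 1, 0, 0, 0]] * 5 + [[0, 0, 0, 1, 1, 1]] * 5
weights, probs, resp = fit_bernoulli_mixture(data, k=2)
labels = [max(range(2), key=lambda j: r[j]) for r in resp]
print(labels)
```

The responsibilities give a soft clustering, and the fitted per-component probabilities can be read as a compact description of each cluster.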

Neurocomputing | 2014

Impact of a metric of association between two variables on performance of filters for binary data

Kashif Javed; Haroon Atique Babri; Mehreen Saeed

In the feature selection community, filters are quite popular. The design of a filter depends on two parameters, namely the objective function and the metric it employs for estimating the feature-to-class (relevance) and feature-to-feature (redundancy) association. Filter designers pay relatively more attention to the objective function, but a poor metric can overshadow the goodness of an objective function. The metrics that have been proposed in the literature estimate relevance and redundancy differently, thus raising the question: can the metric estimating the association between two variables improve the feature selection capability of a given objective function or, in other words, of a filter? This paper investigates this question. Mutual information is the metric proposed for measuring the relevance and redundancy between the features for the mRMR filter [1], while the MBF filter [2] employs the correlation coefficient. Symmetrical uncertainty, a variant of mutual information, is used by the fast correlation-based filter (FCBF) [3]. We carry out experiments on the mRMR, MBF and FCBF filters with three different metrics (mutual information, correlation coefficient and diff-criterion) using three binary data sets and four widely used classifiers. We find that MBF's performance is much better if it uses the diff-criterion rather than the correlation coefficient, while mRMR with the diff-criterion demonstrates performance better than or comparable to mRMR with mutual information. For the FCBF filter, the diff-criterion also yields results much better than mutual information.
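
Two of the association metrics named in this abstract can be sketched for discrete variables as below. The code uses the usual definitions, I(X;Y) = H(X) + H(Y) - H(X,Y) and SU = 2 I(X;Y) / (H(X) + H(Y)); the function names and toy vectors are assumptions. The diff-criterion is not sketched because its definition is not given here.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X, Y), estimated from paired samples."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def symmetrical_uncertainty(x, y):
    """Mutual information normalised to the range [0, 1]."""
    denom = entropy(x) + entropy(y)
    return 2.0 * mutual_information(x, y) / denom if denom else 0.0


feature = [0, 0, 1, 1]
same_as_feature = [0, 0, 1, 1]   # perfectly associated with `feature`
unrelated = [0, 1, 0, 1]         # empirically independent of `feature`

print(symmetrical_uncertainty(feature, same_as_feature))
print(symmetrical_uncertainty(feature, unrelated))
```

A filter would apply such a metric twice: between each feature and the class (relevance) and between pairs of features (redundancy).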

Knowledge and Information Systems | 2014

The correctness problem: evaluating the ordering of binary features in rankings

Kashif Javed; Mehreen Saeed; Haroon Atique Babri

In machine learning, feature ranking (FR) algorithms are used to rank features by relevance to the class variable. FR algorithms are mostly investigated for the feature selection problem and less studied for the problem of ranking. This paper focuses on the latter. A question asked about the problem of ranking given in the terminology of FR is: as different FR criteria estimate the relationship between a feature and the class variable differently on a given data set, can we determine which criterion better captures the “true” feature-to-class relationship and thus generates the most “correct” order of individual features? This is termed the “correctness” problem. It requires a reference ordering against which the ranks assigned to features by an FR algorithm are directly compared. The reference ranking is generally unknown for real-life data. In this paper, we show through theoretical and empirical analysis that for two-class classification tasks represented with binary data, the ordering of binary features based on their individual predictive powers can be used as a benchmark. This allows us to test how correct the ordering of an FR algorithm is. Based on these ideas, an evaluation method termed FR evaluation strategy (FRES) is proposed. Rankings of three different FR criteria (relief, mutual information, and the diff-criterion) are investigated on five artificially generated and four real-life binary data sets. The results indicate that FRES works equally well for synthetic and real-life data and that the diff-criterion generates the most correct orderings for binary data.

international symposium on neural networks | 2014

Design of the first neuronal connectomics challenge: From imaging to connectivity

Isabelle Guyon; Demian Battaglia; Alice Guyon; Vincent Lemaire; Javier G. Orlandi; Bisakha Ray; Mehreen Saeed; Jordi Soriano; Alexander Statnikov; Olav Stetter

We are organizing a challenge to reverse engineer the structure of neuronal networks from patterns of activity recorded with calcium fluorescence imaging. Unraveling the brain structure at the neuronal level at a large scale is an important step in brain science, with many ramifications in the comprehension of animal and human intelligence and learning capabilities, as well as understanding and curing neuronal diseases and injuries. However, uncovering the anatomy of the brain by disentangling the neural wiring with its very fine and intertwined dendrites and axons, making both local and far reaching synapses, is a very arduous task: traditional methods of axonal tracing are tedious, difficult, and time consuming. This challenge proposes to approach the problem from a different angle, by reconstructing the effective connectivity of a neuronal network from observations of neuronal activity of thousands of neurons, which can be obtained with state-of-the-art fluorescence calcium imaging. To evaluate the effectiveness of proposed algorithms, we will use data obtained with a realistic simulator of real neurons for which we have ground truth of the neuronal connections. We produced simulated calcium imaging data, taking into account a model of fluorescence and light scattering. The task of the participants is to reconstruct a network of 1000 neurons from time series of neuronal activities obtained with this model. This challenge is part of the official selection of the WCCI 2014 competition program.

international symposium on neural networks | 2008

Classifiers based on Bernoulli mixture models for text mining and handwriting recognition tasks

Mehreen Saeed; Haroon Atique Babri

In this paper we describe a model for classifying binary data using classifiers based on Bernoulli mixture models. We show how Bernoulli mixtures can be used for feature extraction and dimensionality reduction of raw input data. The extracted features are then used for training a classifier for supervised labeling of individual sample points. We have applied this method to two different types of datasets, i.e., one from the text mining domain and one from the handwriting recognition area. Empirical experiments demonstrate that we can obtain up to 99.9% reduction in the dimensionality of the original feature set for sparse binary features. Classification accuracy also increases considerably when the combined model is used. This paper compares the performance of different classification algorithms when used in conjunction with the new feature set generated by Bernoulli mixtures. Using this hybrid model of learning we have achieved one of the best accuracy rates on the NOVA and GINA datasets of the “agnostic vs. prior knowledge” competition held by the International Joint Conference on Neural Networks in 2007.

PLOS ONE | 2012

Reverse Engineering Boolean Networks: From Bernoulli Mixture Models to Rule Based Systems

Mehreen Saeed; Maliha Ijaz; Kashif Javed; Haroon Atique Babri

A Boolean network is a graphical model for representing and analyzing the behavior of gene regulatory networks (GRN). In this context, the accurate and efficient reconstruction of a Boolean network is essential for understanding the gene regulation mechanism and the complex relations that exist therein. In this paper we introduce an elegant and efficient algorithm for the reverse engineering of Boolean networks from a time series of multivariate binary data corresponding to gene expression data. We call our method ReBMM, i.e., reverse engineering based on Bernoulli mixture models. The time complexity of most of the existing reverse engineering techniques is quite high and depends upon the indegree of a node in the network. Due to the high complexity of these methods, they can only be applied to sparsely connected networks of small sizes. ReBMM has a time complexity factor, which is independent of the indegree of a node and is quadratic in the number of nodes in the network, a big improvement over other techniques and yet there is little or no compromise in accuracy. We have tested ReBMM on a number of artificial datasets along with simulated data derived from a plant signaling network. We also used this method to reconstruct a network from real experimental observations of microarray data of the yeast cell cycle. Our method provides a natural framework for generating rules from a probabilistic model. It is simple, intuitive and illustrates excellent empirical results.
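
To make the input to such a reverse engineering method concrete, the sketch below simulates a small Boolean network with synchronous updates and produces the kind of multivariate binary time series the abstract describes. The three-gene rules are hypothetical and chosen only for illustration; this is not ReBMM and is not taken from the paper.

```python
def step(state, rules):
    """Apply every update rule to the current state at once (synchronous)."""
    return tuple(int(bool(rule(state))) for rule in rules)


def simulate(initial, rules, n_steps):
    """Return the trajectory [s_0, s_1, ..., s_n] of binary state tuples."""
    trajectory = [tuple(initial)]
    for _ in range(n_steps):
        trajectory.append(step(trajectory[-1], rules))
    return trajectory


# Hypothetical rules: each gene's next value is a Boolean function of the
# current values of its regulators.
rules = [
    lambda s: s[2],               # gene 0 follows gene 2
    lambda s: s[0] and not s[2],  # gene 1 needs gene 0 and no gene 2
    lambda s: s[0] or s[1],       # gene 2 is on if gene 0 or gene 1 is on
]

trajectory = simulate((1, 0, 0), rules, n_steps=3)
for t, state in enumerate(trajectory):
    print(t, state)
```

Reverse engineering runs in the opposite direction: given only trajectories like `trajectory`, recover which genes regulate which, and through what rule.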

international conference on electrical engineering | 2008

The behavior of k-Means: An empirical study

Kashif Javed; Haroon Atique Babri; Mehreen Saeed

In this paper, we study the behavior of the typical k-Means clustering algorithm by investigating the distributions of the final centroids, the sum-of-squares error and the iterations to convergence. This behavior is observed on two different synthetic data sets. It is found that when the clusters are well isolated from each other, the spread of the solutions found by k-Means algorithm indicates a much larger number of local minima as compared to the data set in which clusters overlap.
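
The kind of experiment described here, repeated k-Means runs that record the final centroids, the sum-of-squares error (SSE) and the iterations to convergence, can be sketched as follows. Plain Lloyd's algorithm with centroids initialised from randomly chosen data points is an assumption, and the four-point data set is a toy, not the paper's synthetic data.

```python
import random


def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    return tuple(sum(coord) / len(points) for coord in zip(*points))


def kmeans(points, k, rng, max_iter=100):
    """Lloyd's algorithm; returns (centroids, sse, iterations)."""
    centroids = rng.sample(points, k)
    iteration = 0
    for iteration in range(1, max_iter + 1):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        new_centroids = [
            mean(c) if c else centroids[j] for j, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    sse = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, sse, iteration


# Two well-separated pairs of points; k-Means can still get stuck.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
rng = random.Random(0)
runs = [kmeans(points, 2, rng) for _ in range(20)]
for sse in sorted({sse for _, sse, _ in runs}):
    count = sum(1 for _, s, _ in runs if s == sse)
    print(f"SSE {sse}: reached in {count} of {len(runs)} runs")
```

On this toy data a run either finds the two tight pairs or, when both initial centroids fall in the same pair, converges to a much worse local minimum, so collecting SSE over many restarts exposes the spread of solutions.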

IEEE Transactions on Knowledge and Data Engineering | 2012

Feature Selection Based on Class-Dependent Densities for High-Dimensional Binary Data

Kashif Javed; Haroon Atique Babri; Mehreen Saeed

international symposium on neural networks | 2015

Design of the 2015 ChaLearn AutoML challenge

Isabelle Guyon; Kristin P. Bennett; Gavin C. Cawley; Hugo Jair Escalante; Sergio Escalera; Tin Kam Ho; Núria Macià; Bisakha Ray; Mehreen Saeed; Alexander R. Statnikov; Evelyne Viegas
