Hasin Afzal Ahmed | Researchain

Publication

Featured researches published by Hasin Afzal Ahmed.

2011 2nd National Conference on Emerging Trends and Applications in Computer Science | 2011

Triclustering in gene expression data analysis: A selected survey

Priyakshi Mahanta; Hasin Afzal Ahmed; Dhruba K. Bhattacharyya; Jugal K. Kalita

Mining microarray data sets is important in bioinformatics research and biomedical applications. Mining triclusters, or 3D clusters, in gene-sample-time (3D) microarray data has recently emerged as an area of research. Each tricluster contains a subset of genes and a subset of samples such that the genes are coherent on the samples along the time series. Triclustering algorithms are scarce in the microarray data analysis literature. We review some existing triclustering algorithms and discuss their merits and demerits. Finally, through our analysis of these algorithms, we expose the issues that remain challenging in triclustering, to give researchers who are new to this field a base platform.

IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2014

Shifting-and-scaling correlation based biclustering algorithm

Hasin Afzal Ahmed; Priyakshi Mahanta; Dhruba K. Bhattacharyya; Jugal K. Kalita

The existence of various types of correlations among the expressions of a group of biologically significant genes poses challenges in developing effective methods of gene expression data analysis. The initial focus of computational biologists was to work with only absolute and shifting correlations. However, researchers have found that the ability to handle shifting-and-scaling correlation enables them to extract more biologically relevant and interesting patterns from gene microarray data. In this paper, we introduce an effective shifting-and-scaling correlation measure named Shifting and Scaling Similarity (SSSim), which can detect highly correlated gene pairs in any gene expression data. We also introduce a biclustering algorithm named Intensive Correlation Search (ICS), which uses SSSim to extract biologically significant biclusters from a gene expression data set. The technique performs satisfactorily on a number of benchmark gene expression data sets when evaluated in terms of functional categories in the Gene Ontology database.
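As a rough illustration of what a shifting-and-scaling pattern looks like, the sketch below compares a hypothetical expression profile with shifted and shifted-and-scaled copies of itself. SSSim is not defined in this abstract, so the Pearson correlation coefficient is used here purely as a stand-in, because it is invariant to shifting and to scaling by a positive factor. The profile values and variable names are made up for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles.

    Stand-in measure only: this is NOT the SSSim measure from the paper.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (norm_x * norm_y)

# Hypothetical expression values of one gene over five conditions.
gene_a = [1.0, 3.0, 2.0, 5.0, 4.0]
# Shifting pattern: same shape, offset by a constant.
shifted = [v + 10.0 for v in gene_a]
# Shifting-and-scaling pattern: same shape, stretched and offset.
shifted_scaled = [2.5 * v + 10.0 for v in gene_a]
# A profile with a different shape, for contrast.
unrelated = [4.0, 1.0, 5.0, 2.0, 3.0]

r_shift = pearson(gene_a, shifted)
r_shift_scale = pearson(gene_a, shifted_scaled)
r_unrelated = pearson(gene_a, unrelated)
```

A plain distance such as Euclidean distance would rate `shifted_scaled` as far from `gene_a`, whereas a shift-and-scale-aware measure treats the two as the same pattern.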

BMC Bioinformatics | 2012

An effective method for network module extraction from microarray data

Priyakshi Mahanta; Hasin Afzal Ahmed; Dhruba K. Bhattacharyya; Jugal K. Kalita

Background: The development of high-throughput microarray technologies has provided various opportunities to systematically characterize diverse types of computational biological networks. Co-expression networks have become popular in the analysis of microarray data, such as for detecting functional gene modules. Results: This paper presents a method to build a co-expression network (CEN) and to detect network modules from the built network. We use an effective gene expression similarity measure called NMRS (Normalized Mean Residue Similarity) to construct the CEN. We have tested our method on five publicly available benchmark microarray datasets. The network modules extracted by our algorithm have been biologically validated in terms of Q value and p value. Conclusions: Our results show that the technique is capable of detecting biologically significant network modules from the co-expression network. Biologists can use this technique to find groups of genes with similar functionality based on their expression information.
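The general idea of building a co-expression network and reading modules off it can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's method: Pearson correlation stands in for NMRS (which is not defined in this abstract), edges are created by a simple similarity threshold, and modules are taken to be connected components. The gene names, profiles, threshold, and function names are all hypothetical.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation; a stand-in here for the NMRS measure."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def build_cen(profiles, threshold):
    """Link two genes when their similarity is at least `threshold`."""
    adjacency = {gene: set() for gene in profiles}
    for g, h in combinations(profiles, 2):
        if pearson(profiles[g], profiles[h]) >= threshold:
            adjacency[g].add(h)
            adjacency[h].add(g)
    return adjacency

def extract_modules(adjacency, min_size=2):
    """Return connected components with at least `min_size` genes."""
    visited = set()
    modules = []
    for start in adjacency:
        if start in visited:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        visited |= component
        if len(component) >= min_size:
            modules.append(sorted(component))
    return modules

# Hypothetical expression profiles over four conditions.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [5.0, 1.0, 4.0, 2.0],
    "g4": [10.0, 2.0, 8.0, 4.0],
    "g5": [3.0, 3.5, 1.0, 2.0],
}
network = build_cen(profiles, threshold=0.9)
modules = extract_modules(network)
```

With these made-up profiles, `g1`/`g2` and `g3`/`g4` form two modules, and `g5` stays unattached and is dropped by `min_size`.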

bioinformatics and bioengineering | 2011

GERC: Tree Based Clustering for Gene Expression Data

Hasin Afzal Ahmed; Priyakshi Mahanta; Dhruba K. Bhattacharyya; Jugal K. Kalita

Measurement of gene expression using DNA microarrays has revolutionized biological and medical research. This paper presents a divisive clustering algorithm that produces a tree of genes, called a GERC tree, along with the generated clusters. Unlike a dendrogram, a GERC tree is a general tree, and it is an ample resource for biological information about the genes in a data set. The leaves of the tree represent the desired clusters. The clustering method was tested with several real-life data sets and its performance was found satisfactory.

world congress on information and communication technologies | 2011

Intersected coexpressed subcube miner: An effective triclustering algorithm

Hasin Afzal Ahmed; Priyakshi Mahanta; Dhruba K. Bhattacharyya; Jugal K. Kalita; Ashish Ghosh

Triclustering techniques extract genes that have similar expression patterns in a set of samples across a set of time points. A challenge in triclustering is to account for both inter-temporal and intra-temporal gene coherence. Other challenges include avoidance of time-dominated and sample-dominated results and detection of time latent triclusters. This paper presents a technique based on order preserving sub-matrices to find a set of triclusters from gene-sample-time data. The technique finds a set of initial modules in each unordered pair of gene-sample planes, which are then extended to final triclusters. We propose a planar similarity measure called PMRS to extend the initial modules to the final triclusters.

Network Modeling Analysis in Health Informatics and BioInformatics | 2016

Big data analytics in bioinformatics: architectures, techniques, tools and issues

Hirak Kashyap; Hasin Afzal Ahmed; Nazrul Hoque; Swarup Roy; Dhruba K. Bhattacharyya

Bioinformatics research is characterized by voluminous and incremental datasets and complex data analytics methods. The machine learning methods used in bioinformatics are iterative and parallel. These methods can be scaled to handle big data using distributed and parallel computing technologies. Usually, big data tools perform computation in batch mode and are not optimized for iterative processing and high data dependency among operations. In recent years, parallel, incremental, and multi-view machine learning algorithms have been proposed. Similarly, graph-based architectures and in-memory big data tools have been developed to minimize I/O cost and optimize iterative processing. However, standard big data architectures are still lacking. Also, appropriate tools are not available for many important bioinformatics problems, such as fast construction of co-expression and regulatory networks and salient module identification, detection of complexes over growing protein-protein interaction data, fast analysis of massive DNA, RNA, and protein sequence data, and fast querying on incremental and heterogeneous disease networks. This paper addresses the issues and challenges posed by several big data problems in bioinformatics, and gives an overview of the state of the art and future research opportunities.

Network Modeling Analysis in Health Informatics and BioInformatics | 2015

Unsupervised methods for finding protein complexes from PPI networks

Pooja Sharma; Hasin Afzal Ahmed; Swarup Roy; Dhruba K. Bhattacharyya

Complex biological systems are often represented as networks and studied computationally. In protein–protein interaction networks, interactions give rise to certain compounds known as protein complexes. Identifying functional protein complexes is an emerging field of study in systems biology. Several machine learning methods have been proposed so far to detect functionally enriched protein complexes responsible for specific biological functions or diseases. We present an empirical study of the different unsupervised approaches to identifying such complexes. We report the performance of seven popular methods on four benchmark datasets in terms of six evaluation parameters. We also highlight some issues and challenges prevailing in this field of research.

computational science and engineering | 2012

Discretization in gene expression data analysis: a selected survey

Priyakshi Mahanta; Hasin Afzal Ahmed; Jugal K. Kalita; Dhruba K. Bhattacharyya

Discretization techniques are widely used as a preprocessing step for classification techniques, especially in machine learning. These techniques have also been used as a preprocessing step for the computational construction of regulatory networks in gene expression data analysis. We analyze the use of some widely used discretization techniques in other gene expression data analysis tasks, such as gene functional prediction. This paper evaluates the performance of these discretization techniques as a preprocessing step by applying different clustering algorithms to the discretized gene expression data. The results generated by the clustering algorithms are internally and externally validated for the different discretization techniques. Finally, we introduce some of the important issues and research challenges.
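To make the notion of discretization as a preprocessing step concrete, here is a minimal sketch of equal-width binning, one common discretization scheme. The abstract does not name the techniques it surveys, so this choice is an assumption made only for illustration; the function name, bin count, and expression values are hypothetical.

```python
def equal_width_bins(values, n_bins):
    """Map each value to a bin index in 0..n_bins-1 using equal-width bins.

    Illustrative scheme only; not necessarily one evaluated in the survey.
    """
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    low, high = min(values), max(values)
    if high == low:
        # All values identical: everything falls into the first bin.
        return [0] * len(values)
    width = (high - low) / n_bins
    # The maximum value would land on the upper edge, so clamp it
    # into the last bin.
    return [min(int((v - low) / width), n_bins - 1) for v in values]

# Hypothetical expression levels of one gene across six samples.
expression = [0.0, 1.0, 2.0, 5.0, 9.0, 10.0]
levels = equal_width_bins(expression, n_bins=3)
```

The resulting bin indices (for example low, medium, high) can then be passed to a clustering algorithm in place of the raw measurements.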

Computational Biology and Chemistry | 2015

Core and peripheral connectivity based cluster analysis over PPI network

Hasin Afzal Ahmed; Dhruba K. Bhattacharyya; Jugal K. Kalita

A number of methods have been proposed in the protein-protein interaction (PPI) network analysis literature for detecting clusters in the network. These methods identify clusters using various graph-theoretic criteria. Most of them have been found to be time consuming because they involve preprocessing and post-processing tasks. In addition, they do not achieve high precision and recall consistently and simultaneously. Moreover, the existing methods do not effectively employ the core-periphery structural pattern of protein complexes to extract clusters. In this paper, we introduce a clustering method named CPCA based on a recent observation by researchers that a protein complex in a PPI network is arranged as a relatively dense core region and additional proteins weakly connected to the core. CPCA uses two connectivity criterion functions to identify the core and peripheral regions of a cluster. To locate the initial node of a cluster, we introduce a measure called the DNQ (Degree based Neighborhood Qualification) index, which evaluates the tendency of a node to be part of a cluster. CPCA performs well when compared with well-known counterparts. Along with protein complex gold standards, a co-localization dataset has also been used to validate the results.

Network Modeling Analysis in Health Informatics and BioInformatics | 2014

A statistical feature selection technique

Pallabi Borah; Hasin Afzal Ahmed; Dhruba K. Bhattacharyya

Various feature selection techniques have been proposed in the field of machine learning. Filter approaches are typically faster, while wrapper approaches are more reliable though computationally expensive. Feature selection techniques often strive to achieve performance similar to wrapper approaches by employing various computational approaches. They typically depend on how they compute feature–feature correlation and feature–class correlation. These two computations are largely governed by the correlation measure being used. In this work, a method named enhanced correlation-based feature selection (ECFS) is developed to effectively employ the feature–feature and feature–class correlations to extract a relevant feature subset from multi-class gene expression data as well as machine learning datasets. The performance of ECFS, in terms of classification accuracies obtained by decision tree, random forest and KNN classifiers, has been found highly satisfactory over several benchmark datasets.
