Anupam Bhattacharjee
Wayne State University
Publication
Featured research published by Anupam Bhattacharjee.
information reuse and integration | 2009
Anupam Bhattacharjee; Hasan M. Jamil
Traditional schema matchers use a set of distinct simple matchers and a composition function that combines the individual scores in an arbitrary order of matcher application. This leads to non-intuitive scores, improper matches, and wasteful, counterproductive computation, especially when no consideration is given to the properties of the individual matchers and the context of the application. In this paper, we propose a new method for schema matching in which wasteful computation is avoided by a prudent and objective selection and ordering of a subset of useful matchers. This method thus has the potential to improve the matching efficiency and accuracy of many popular ontology generation engines. Such efficiency and quality assurance are imperative in autonomous systems because users rarely have a chance to validate the processing accuracy until the computation is complete. Experimental results are also provided to support the claim that such an approach monotonically improves the matching score with each successive application of the matchers.
database and expert systems applications | 2009
Anupam Bhattacharjee; Aminul Islam; Mohammad Shafkat Amin; Shahriyar Hossain; Shazzad Hosain; Hasan M. Jamil; Leonard Lipovich
Data-intensive applications in the life sciences extensively use the hidden web as a platform for information sharing. Access to these heterogeneous hidden web resources is limited to predefined web forms and interactive interfaces that users navigate manually, assuming responsibility for reconciling schema heterogeneity, extracting and piping information, transforming formats, and so on, in order to implement desired query sequences or scientific workflows. In this paper, we present a new data management system, called LifeDB, in which we offer support for currency without view materialization and autonomous reconciliation of schema heterogeneity in a single platform through a declarative query language called BioFlow. In our approach, schema heterogeneity is resolved at run time by treating the hidden web resources as virtual warehouses, and by supporting a set of primitives for integrating data on the fly, extracting information and piping it to other resources, and manipulating data in a way similar to traditional database systems to respond to application demands.
intelligent information systems | 2012
Anupam Bhattacharjee; Hasan M. Jamil
Given a large undirected or directed weighted data graph and a similar, smaller weighted pattern graph, the problem of weighted subgraph matching is to find a mapping of the nodes in the pattern graph to a subset of nodes in the data graph such that the sum of edge weight differences is minimum. Biological interaction networks such as protein-protein interaction networks and molecular pathways are often modeled as weighted graphs in order to account for the high false positive rate occurring intrinsically during the detection of the interactions. Complex biological problems such as disease gene prioritization and conserved phylogenetic tree construction largely depend on similarity calculation among these networks. Although several existing methods measure graph and subgraph similarity efficiently, they produce nonintuitive results due to the underlying unweighted graph model assumption. Moreover, very few algorithms exist for weighted graph matching, and those are applicable only under the restriction that the data and pattern graph sizes are equal. In this paper, we introduce a novel algorithm for weighted subgraph matching that can be applied effectively to both directed and undirected weighted graphs. Experimental results demonstrate the superiority and relative scalability of the algorithm over available state-of-the-art methods.
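The objective stated in this abstract can be illustrated with a minimal brute-force sketch. This is not the algorithm proposed in the paper, which the abstract does not describe. It simply enumerates every injective node mapping and scores it by the sum of absolute edge weight differences. The function names, the dictionary-based graph representation, and the `missing_penalty` charged when a pattern edge has no counterpart in the data graph are all assumptions made for illustration.

```python
from itertools import permutations


def match_cost(pattern_edges, data_edges, mapping, missing_penalty=1.0):
    """Sum of |w_pattern - w_data| over pattern edges under a node mapping.

    Edges are dicts keyed by directed (source, target) pairs.
    missing_penalty is an assumed cost for a pattern edge that has no
    corresponding edge in the data graph.
    """
    cost = 0.0
    for (u, v), weight in pattern_edges.items():
        mapped_edge = (mapping[u], mapping[v])
        if mapped_edge in data_edges:
            cost += abs(weight - data_edges[mapped_edge])
        else:
            cost += missing_penalty
    return cost


def best_mapping(pattern_nodes, pattern_edges, data_nodes, data_edges,
                 missing_penalty=1.0):
    """Exhaustively search injective mappings; exponential, toy sizes only."""
    best_cost, best_map = float("inf"), None
    for image in permutations(data_nodes, len(pattern_nodes)):
        mapping = dict(zip(pattern_nodes, image))
        cost = match_cost(pattern_edges, data_edges, mapping, missing_penalty)
        if cost < best_cost:
            best_cost, best_map = cost, mapping
    return best_cost, best_map


# Hypothetical toy graphs: one weighted pattern edge, three data edges.
pattern_nodes = ["a", "b"]
pattern_edges = {("a", "b"): 0.9}
data_nodes = ["x", "y", "z"]
data_edges = {("x", "y"): 0.5, ("y", "z"): 0.85, ("z", "x"): 0.2}

cost, mapping = best_mapping(pattern_nodes, pattern_edges,
                             data_nodes, data_edges)
print(cost, mapping)
```

In this toy instance the pattern edge aligns most closely with the data edge from `y` to `z`, whose weight differs least from 0.9. For an undirected graph, each edge could be stored in both directions under the same representation.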
data mining in bioinformatics | 2014
Kazi Zakia Sultana; Anupam Bhattacharjee; Hasan M. Jamil
Understanding the interaction patterns among biological entities in a pathway can potentially reveal the role of the entities in biological systems. Although considerable effort has been devoted to this direction, querying biological pathways has remained relatively unexplored. Querying is fundamentally different in that we retrieve pathways satisfying a given property in terms of their topology or constituents. One such property is subnetwork matching using various constituent parameters. In this paper, we introduce a logic-based framework for querying biological pathways using a novel and generic subgraph isomorphism computation technique. We develop a graphical interface called IsoKEGG to facilitate flexible querying of KEGG pathways based on isomorphic pathway topologies as well as matching any combination of node names, types, and edges. It allows editing KGML-represented query pathways and returns all isomorphic patterns in KEGG pathways satisfying a given query condition for further analysis.
acm symposium on applied computing | 2010
Mohammad Shafkat Amin; Anupam Bhattacharjee; Russell L. Finley; Hasan M. Jamil
Experimental methods are beginning to define the networks of interacting genes and proteins that control most biological processes. There is significant interest in developing computational approaches to identify subnetworks that control specific processes or that may be involved in specific human diseases. Because genes associated with a particular disease (i.e., disease genes) are likely to be well connected within the interaction network, the challenge is to identify the most well-connected subnetworks from a large number of possible subnetworks. One way to do this is to search through chromosomal loci, each of which has many candidate disease genes, to find a subset of genes well connected in the interaction network. In order to identify a significantly connected subnetwork, however, an efficient method of selecting candidate genes from each locus is needed. In the current study, we describe a method to extract important candidate subnetworks from a set of loci, each containing numerous genes. The method scales with the size of the interaction networks. We have conducted simulations with our method and observed promising performance.
acm symposium on applied computing | 2013
Anupam Bhattacharjee; Hasan M. Jamil
In this paper, we present an improved and novel directed graph matching algorithm, called CodeBlast, for searching functionally similar program segments in software repositories with greater effectiveness and accuracy. CodeBlast uses a novel canonical labeling concept to capture order-independent data flow patterns in a program, to encode the program's functional semantics, and to aid matching. CodeBlast is capable of exact and approximate directed graph matching and is particularly suitable for matching Program Dependence Graphs. Introducing the notion of semantic equivalence in CodeBlast helps discover clone matches with a precision and recall that were not possible using systems such as JPlag, MOSS, and GPlag. We substantiate our claim through experimental evidence and comparative analysis with these leading systems.
bioinformatics and biomedicine | 2010
Kazi Zakia Sultana; Anupam Bhattacharjee; Hasan M. Jamil
Understanding the interaction patterns among a set of biological entities in a pathway is an important exercise because it could potentially reveal the role of the entities in biological systems. Although a considerable amount of effort has been directed to the detection and mining of patterns in biological pathways in contemporary research, querying biological pathways has remained relatively unexplored. Querying is fundamentally different in that we retrieve pathways that satisfy a given property in terms of their topology or constituents. One such property is subnetwork matching using various constituent parameters. In this paper, we introduce a logic-based framework for querying biological pathways based on a novel and generic subgraph isomorphism computation technique. We cast this technique into a graphical interface called IsoKEGG to facilitate flexible querying of KEGG pathways. We demonstrate that IsoKEGG is flexible enough to allow querying based on isomorphic pathway topologies as well as matching any combination of node names, types, and edges. It also allows editing KGML-represented query pathways and returns all pathways in KEGG that satisfy a given query condition, which users can then investigate further.
acm symposium on applied computing | 2010
Mohammad Shafkat Amin; Anupam Bhattacharjee; Hasan M. Jamil
As the volume of information available on the internet grows exponentially, it is clear that most of this information will have to be processed and digested by computers to produce useful information for human consumption. Unfortunately, most web content is currently designed for direct human consumption, on the assumption that a human will decipher the information presented in some context and will be able to connect the missing dots, if any. In particular, information presented in tabular form is often not accompanied by descriptive titles or column names similar to attribute names in tables. While such omissions are not really an issue for humans, they make it truly hard to extract information in autonomous systems in which a machine is expected to understand the meaning of the table presented and extract the right information in the context of the query. It is even more difficult when the information needed is distributed across the globe and involves semantic heterogeneity. In this paper, our goal is to address the issue of how to interpret tables with missing column names by developing a method for assigning attribute names to an arbitrary table extracted from the web in a fully autonomous manner. We propose a novel approach that leverages Wikipedia, for the first time, for column name discovery for the purpose of table annotation. We show that this leads to an improved likelihood of capturing the context and interpretation of the table accurately and producing a semantically meaningful query response.
flexible query answering systems | 2009
Kazi Zakia Sultana; Anupam Bhattacharjee; Mohammad Shafkat Amin; Hasan M. Jamil
In computer-based internet services, queries are usually submitted in a context. The contexts are either created or assumed, e.g., a purchase order or an airline reservation. Unfortunately, there is little theoretical foundation for contexts, and systems usually do not use them formally. In this paper, we propose a model for context representation in the direction of aspect-oriented programming and object-oriented systems, and show that contexts can be used to process queries better. We outline a brief model that we are pursuing based on the idea of constraint inheritance with exceptions in a query tree.
acm symposium on applied computing | 2010
Mohammad Shafkat Amin; Anupam Bhattacharjee; Hasan M. Jamil
The study of interactomes requires assembling complex tools, ontologies, online interaction network databases, and so on to validate hypotheses and gain insight. One of the major bottlenecks is the discovery of similar or isomorphic subgraphs in very large interactomes and cross-referencing the relationships that a set of proteins or genes share. These interactomes are so large that most traditional subgraph isomorphism computation tools are unable to handle them efficiently, whether as stand-alone tools or as part of systems such as R. In this paper, we present a Cytoscape plugin to compute and discover isomorphic subnetworks in large interactomes based on a novel and efficient isomorphic subgraph computation method developed in our laboratory. Given an input interactome and a query subnetwork, the plugin can efficiently compute interactome subnetworks similar to the query network, and cross-reference the results with GO or other interactome databases with the aid of other available Cytoscape plugins such as BinGO. We describe the tool with respect to real-life applications that biologists may want to consider.