Pankaj Kankar | Researchain

Publication

Featured research published by Pankaj Kankar.

IEEE Transactions on Knowledge and Data Engineering | 2005

Information retrieval and knowledge discovery utilizing a biomedical patent semantic Web

Sougata Mukherjea; Bhuvan Bamba; Pankaj Kankar

Before undertaking new biomedical research, identifying concepts that have already been patented is essential. A traditional keyword-based search on patent databases may not be sufficient to retrieve all the relevant information, especially for the biomedical domain. This paper presents BioPatentMiner, a system that facilitates information retrieval and knowledge discovery from biomedical patents. The system first identifies biological terms and relations from the patents and then integrates the information from the patents with knowledge from biomedical ontologies to create a semantic Web. Besides keyword search and queries linking the properties specified by one or more RDF triples, the system can discover semantic associations between the Web resources. The system also determines the importance of the resources to rank the results of a search and prevent information overload while determining the semantic associations.

conference on information and knowledge management | 2003

Information extraction from biomedical literature: methodology, evaluation and an application

L. Venkata Subramaniam; Sougata Mukherjea; Pankaj Kankar; Biplav Srivastava; Vishal S. Batra; Pasumarti V. Kamesam; Ravi Kothari

Journals and conference proceedings represent the dominant mechanisms of reporting new biomedical results. The unstructured nature of such publications makes it difficult to utilize data mining or automated knowledge discovery techniques. Annotation (or markup) of these unstructured documents represents the first step in making these documents machine analyzable. In this paper we first present a system called BioAnnotator for identifying and annotating biological terms in documents. BioAnnotator uses domain-based dictionary look-up for recognizing known terms and a rule engine for discovering new terms. The combination of dictionary look-up and rules results in good performance (87% precision and 94% recall on the GENIA 1.1 corpus for extracting general biological terms based on an approximate matching criterion). To demonstrate the subsequent mining and knowledge discovery activities that are made feasible by BioAnnotator, we also present a system called MedSummarizer that uses the extracted terms to identify the common concepts in a given group of genes.

international conference on service oriented computing | 2005

Handling faults in decentralized orchestration of composite web services

Girish Chafle; Sunil Chandra; Pankaj Kankar; Vijay Mann

Composite web services can be orchestrated in a decentralized manner by breaking down the original service specification into a set of partitions and executing them on a distributed infrastructure. The infrastructure consists of multiple service engines communicating with each other over asynchronous messaging. Decentralized orchestration yields performance benefits by exploiting concurrency and reducing the data on the network. Further, decentralized orchestration may be necessary to orchestrate certain composite web services due to privacy and data flow constraints. However, decentralized orchestration also results in additional complexity due to the absence of a centralized global state and the overlapping or different life cycles of the various partitions. This makes handling faults that arise from composite service partitions, or from the failure of component web services, a challenging task. In this paper we propose a mechanism for handling faults in decentralized orchestration of composite web services. The mechanism includes a strategy for placement of fault handlers and compensation handlers, and schemes for fault propagation and fault recovery. The mechanism is designed to maintain the semantics of the original specification while ensuring minimal overheads.

component-based software engineering | 2005

Reusable dialog component framework for rapid voice application development

Rahul P. Akolkar; Tanveer A. Faruquie; Juan M. Huerta; Pankaj Kankar; Nitendra Rajput; Thiruvilwamalai V. Raman; Raghavendra Udupa; Abhishek Verma

Voice application development requires specialized speech-related skills in addition to general programming ability. Encapsulating the speech-specific behavior and complexities in prepackaged, configurable User Interface (UI) components will ease and expedite voice application development. These components can be used across applications and are called Reusable Dialog Components (RDCs). In this paper we propose a programming model and a framework for developing reusable dialog components. Our framework facilitates the development of voice applications via the encapsulation of interaction mechanisms, the encapsulation of best-of-breed practices (i.e., grammars, prompts, and configuration parameters), a modular design, and pluggable dialog management strategies. The framework extends the standard J2EE/JSP-based programming model to make it suitable for voice applications.

conference on information and knowledge management | 2002

A system for knowledge management in bioinformatics

Sudeshna Adak; Vishal S. Batra; Deo N. Bhardwaj; Pasumarti V. Kamesam; Pankaj Kankar; Manish P. Kurhekar; Biplav Srivastava

The emerging biochip technology has made it possible to simultaneously study expression (activity level) of thousands of genes or proteins in a single experiment in the laboratory. However, in order to extract relevant biological knowledge from the biochip experimental data, it is critical not only to analyze the experimental data, but also to cross-reference and correlate these large volumes of data with information available in external biological databases accessible online. We address this problem in a comprehensive system for knowledge management in bioinformatics called e2e. To the biologist or biological applications, e2e exposes a common semantic view of inter-relationship among biological concepts in the form of an XML representation called eXpressML, while internally, it can use any data integration solution to retrieve data and return results corresponding to the semantic view. We have implemented an e2e prototype that enables a biologist to analyze her gene expression data in GEML or from a public site like Stanford, and discover knowledge through operations like querying on relevant annotated data represented in eXpressML using pathways data from KEGG, publication data from Medline and protein data from SWISS-PROT.

acm symposium on applied computing | 2005

Text-based summarization and visualization of gene clusters

Pankaj Kankar; Sougata Mukherjea

We present a system named MedSummarizer which uses biomedical literature information to assign biological meaning to a cluster of genes. Using relevant PubMed citations, it creates a ranked list of important biological concepts that describe the gene list. Further, based on the assigned concepts, it computes the similarity between each pair of genes and displays this using a graph-based visualization technique. The system allows the use of a human-curated index (e.g., MeSH terms) as well as automatic annotations derived from free text. We compare the results obtained using these two types of terms.
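
As a rough illustration of concept-based pairwise gene similarity, the sketch below scores each gene pair by the Jaccard overlap of its assigned concept sets. The abstract does not state which similarity measure MedSummarizer uses, so Jaccard is an assumption made here for illustration, and the gene names and concepts are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two concept sets (assumed measure, not from the paper)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(gene_concepts):
    """Return {(gene1, gene2): similarity} for every unordered gene pair."""
    genes = sorted(gene_concepts)
    return {
        (g1, g2): jaccard(gene_concepts[g1], gene_concepts[g2])
        for g1, g2 in combinations(genes, 2)
    }


# Hypothetical genes and concepts, for illustration only.
concepts = {
    "geneA": {"apoptosis", "cell cycle"},
    "geneB": {"apoptosis", "dna repair"},
    "geneC": {"lipid metabolism"},
}
sims = pairwise_similarity(concepts)
```

The resulting pair scores could serve as edge weights in a graph view like the one the abstract describes, with low-scoring pairs left unconnected.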

wireless and mobile computing, networking and communications | 2005

SAMVAAD: speech applications made viable for access-anywhere devices

Nitendra Rajput; Amit Anil Nanavati; Mohit Kumar; Pankaj Kankar; Rajan Dahiya

The proliferation of pervasive devices has stimulated the development of applications that support ubiquitous access via multiple modalities. Since the processing capabilities of pervasive devices differ vastly, device-specific application adaptation becomes essential. We address the problem of speech application adaptation by dialog call-flow reorganisation for pervasive devices with different memory constraints. Given an atomic dialog call-flow $A$ and device memory size $m$, we present optimal deterministic algorithms, RESEQUENCE and BALANCE-TREE, which minimise the number of questions in the reorganised output call-flow $A_m$. Algorithms MASQ and MATREE produce $C_m$, minimally distant from input call-flow $A_m$ while accommodating the memory constraint $m$. These two minimisation criteria are capable of capturing various usability requirements important in dialog call-flow design. The following observation forms the cornerstone of all the algorithms in this paper: two grammars $g_1$ and $g_2$, comprising $|g_1|$ and $|g_2|$ elements respectively, can be merged into a single grammar $g = g_1 \times g_2$ having $|g_1| \cdot |g_2|$ elements for the sequential case, and $g = g_1 + g_2$ having $|g_1| + |g_2|$ elements for the tree case. Device-specific considerations lead us to introduce the concept of an -characterisation of a call-flow, defined as the set of pairs $\{(m_i, q_i) \mid i \in N\}$, where $q_i$ is the minimum number of questions required for memory size $m_i$. Each call-flow has a unique, device-independent signature in its -characterisation, a measure of its adaptability. We present SAMVAAD, a system that implements these algorithms on call-flows authored in VXML containing SRGS grammars. The system was tested on an IBM voice browser using a sample airline reservation system call-flow reorganised for memories ranging from 64 MB to 210 KB. We ran an experiment with 14 users to obtain feedback on the usability of the adapted call-flows.
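
The grammar-merging observation in this abstract can be illustrated with a small sketch. It models a grammar as a plain list of phrases, which is a simplification assumed here and not the SRGS representation the system works with. The function names and sample phrases are hypothetical, and the tree case assumes the two grammars share no phrases.

```python
from itertools import product


def merge_sequential(g1, g2):
    """Sequential case: one combined utterance per (phrase1, phrase2) pair."""
    return [f"{a} {b}" for a, b in product(g1, g2)]


def merge_tree(g1, g2):
    """Tree case: the merged grammar accepts any phrase from either grammar."""
    return list(g1) + list(g2)


# Hypothetical grammars for two questions in an airline call-flow.
cities = ["delhi", "mumbai", "pune"]
days = ["today", "tomorrow"]

sequential = merge_sequential(cities, days)  # |g1| * |g2| elements
tree = merge_tree(cities, days)              # |g1| + |g2| elements
print(len(sequential), len(tree))
```

Merging two questions into one reduces the number of questions but enlarges the grammar that must fit in device memory, multiplicatively in the sequential case. That is the trade-off against the memory constraint that the abstract describes.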

ieee automatic speech recognition and understanding workshop | 2005

Reusable dialog component for content selection from large data sets

Sandeep Jindal; Pankaj Kankar; T.A. Faruquie

Inherent limitations of spoken language interfaces make the task of information access from large data sets difficult. Providing a dialog component which can be easily configured to access information from such data sets is immensely useful. Such a component would ease and expedite the development of speech applications. We propose a dialog component which makes use of user preferences, user profile and utterance history to select relevant information from large data sets. Content presentation is also determined by user preferences and utterance history. The evaluation shows the effectiveness of the technique and the effect of the user profile in accessing information. It also demonstrates the reusability of the component for accessing different data sets.

Archive | 2006