Aditya Kumar Sehgal | Researchain

Publication

Featured research published by Aditya Kumar Sehgal.

BMC Bioinformatics | 2006

Retrieval with gene queries

Aditya Kumar Sehgal; Padmini Srinivasan

Background: Accuracy of document retrieval from MEDLINE for gene queries is crucially important for many applications in bioinformatics. We explore five information retrieval-based methods to rank documents retrieved by PubMed gene queries for the human genome. The aim is to rank relevant documents higher in the retrieved list. We address the special challenges faced due to ambiguity in gene nomenclature: gene terms that refer to multiple genes, gene terms that are also English words, and gene terms that have other biological meanings.

Results: Our two baseline ranking strategies are quite similar in performance. Two of our three LocusLink-based strategies offer significant improvements. These methods work very well even when there is ambiguity in the gene terms. Our best ranking strategy offers significant improvements on three different kinds of ambiguities over our two baseline strategies (improvements range from 15.9% to 17.7% and 11.7% to 13.3% depending on the baseline). For most genes the best ranking query is one that is built from the LocusLink (now Entrez Gene) summary and product information along with the gene names and aliases. For others, the gene names and aliases suffice. We also present an approach that successfully predicts, for a given gene, which of these two ranking queries is more appropriate.

Conclusion: We explore the effect of different post-retrieval strategies on the ranking of documents returned by PubMed for human gene queries. We have successfully applied some of these strategies to improve the ranking of relevant documents in the retrieved sets. This holds true even when various kinds of ambiguity are encountered. We feel that it would be very useful to apply strategies like ours on PubMed search results as these are not ordered by relevance in any way. This is especially so for queries that retrieve a large number of documents.
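The abstract does not say which scoring function the ranking strategies use, so the following is only a minimal sketch of the general idea of post-retrieval re-ranking: build a "ranking query" from a gene's names plus descriptive text, then order the retrieved documents by similarity to it. TF-IDF weighting with cosine similarity is an assumption chosen for illustration, and the function names (`tokenize`, `rerank`) and the toy documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rerank(ranking_query, docs):
    """Order retrieved documents by TF-IDF cosine similarity to a ranking query.

    ranking_query: text built from gene names/aliases plus descriptive text.
    docs: dict mapping a document id to its text (e.g. title and abstract).
    NOTE: TF-IDF/cosine is an illustrative assumption, not the paper's method.
    """
    doc_tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(docs)

    # Document frequency of each term across the retrieved set.
    df = Counter()
    for tokens in doc_tokens.values():
        df.update(set(tokens))

    def weight_vector(tokens):
        tf = Counter(tokens)
        # Smoothed IDF so terms absent from every document still get a weight.
        return {t: c * (math.log((n_docs + 1) / (df[t] + 1)) + 1.0)
                for t, c in tf.items()}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query_vec = weight_vector(tokenize(ranking_query))
    scored = [(doc_id, cosine(query_vec, weight_vector(tokens)))
              for doc_id, tokens in doc_tokens.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy example: the gene symbol CAT (catalase) is also an English word.
toy_docs = {
    "d1": "Catalase CAT activity and hydrogen peroxide decomposition in liver",
    "d2": "Allergy to cat dander in children",
}
ranked = rerank("CAT catalase enzyme hydrogen peroxide", toy_docs)
```

In this toy case the descriptive terms in the ranking query pull the catalase document above the one that uses "cat" as an ordinary English word, which is the kind of ambiguity the abstract describes.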

web information and data management | 2006

Ranking target objects of navigational queries

Louiqa Raschid; Yao Wu; Woei-Jyh Lee; Maria-Esther Vidal; Panayiotis Tsaparas; Padmini Srinivasan; Aditya Kumar Sehgal

Web navigation plays an important role in exploring public interconnected data sources such as life science data. A navigational query in the life science graph produces a result graph which is a layered directed acyclic graph (DAG). Traversing the result paths in this graph reaches a target object set (TOS). The challenge for ranking the target objects is to provide recommendations that reflect the relative importance of the retrieved object, as well as its relevance to the specific query posed by the scientist. We present a metric, layered graph PageRank (lgPR), to rank target objects based on the link structure of the result graph. LgPR is a modification of PageRank; it avoids random jumps to respect the path structure of the result graph. We also outline a metric, layered graph ObjectRank (lgOR), which extends the metric ObjectRank to layered graphs. We then present an initial evaluation of lgPR. We perform experiments on a real-world graph of life sciences objects from NCBI and report on the ranking distribution produced by lgPR. We compare lgPR with PageRank. In order to understand the characteristics of lgPR, an expert compared the Top K target objects (publications in the PubMed source) produced by lgPR and a word-based ranking method that uses text features extracted from an external source (such as Entrez Gene) to rank publications.
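The abstract gives no formula for lgPR, so the following is a simplified illustration rather than the authors' definition. It assumes that score starts uniformly on the first layer, that each node splits its score equally among its outgoing links, and that there is no damping or random jump, so score only travels along result paths. The function name `layered_rank` and the toy graph are hypothetical.

```python
from collections import defaultdict


def layered_rank(layers, edges):
    """Propagate score through a layered DAG without random jumps.

    layers: list of lists of node ids, from the first layer to the target layer.
    edges: dict mapping a node to its successors in the next layer.
    Returns the target-layer nodes sorted by descending score.
    NOTE: uniform start and equal splitting are illustrative assumptions.
    """
    first_layer = layers[0]
    score = {node: 1.0 / len(first_layer) for node in first_layer}

    for layer in layers[:-1]:
        next_scores = defaultdict(float)
        for node in layer:
            successors = edges.get(node, [])
            if not successors:
                continue  # dead end: its score is dropped, never teleported
            share = score.get(node, 0.0) / len(successors)
            for succ in successors:
                next_scores[succ] += share
        score.update(next_scores)

    targets = [(node, score.get(node, 0.0)) for node in layers[-1]]
    return sorted(targets, key=lambda pair: pair[1], reverse=True)


# Toy result graph: genes -> proteins -> publications (the target object set).
toy_layers = [["g1", "g2"], ["p1", "p2"], ["pub1", "pub2", "pub3"]]
toy_edges = {
    "g1": ["p1", "p2"],
    "g2": ["p2"],
    "p1": ["pub1"],
    "p2": ["pub1", "pub2"],
}
ranking = layered_rank(toy_layers, toy_edges)
```

In the toy graph, `pub3` has no incoming path and therefore receives a score of zero, whereas standard PageRank with random jumps would still give it some score.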

Archive | 2008

Analyzing LBD Methods using a General Framework

Aditya Kumar Sehgal; Xin Ying Qiu; Padmini Srinivasan

This chapter provides a bird's-eye view of the methods used for literature-based discovery (LBD). We study these methods with the help of a simple framework that emphasizes objects, links, inference methods, and additional knowledge sources. We consider methods from a domain-independent perspective. Specifically, we review LBD research on postulating gene-disease connections, LBD systems designed for general-purpose biomedical discovery goals, as well as LBD research applied to the web. Opportunities for new methods, gaps in our knowledge, and critical differences between methods are recognized when the “literature on LBD” is viewed through the scope of our framework. The main contributions of this chapter are in presenting open problems in LBD and outlining avenues for further research.

international acm sigir conference on research and development in information retrieval | 2005

Manjal: a text mining system for MEDLINE

Aditya Kumar Sehgal; Padmini Srinivasan

Text mining can be described as the extraction of novel information from text sources. By novel we mean that the extracted information is not explicitly present in the text being analyzed. Text mining systems are generally used to generate hypotheses that are then verified by domain experts. An example of this approach would be to find all novel relationships between a disease and a drug. This approach is useful given the vast amount of information available today. Systems based on this approach, such as ARROWSMITH, have been successfully used to discover novel hypotheses that have later been verified. In this demonstration we describe Manjal, a text mining system designed to help individuals explore one or more topics using the MEDLINE database. MEDLINE is an index to biomedical research that is maintained by the National Library of Medicine (NLM). Typically each MEDLINE record has a title, an abstract, descriptive phrases called MeSH (Medical Subject Headings) terms, and several other fields. Manjal accesses MEDLINE automatically through NLM’s web-based interface, PubMed. Manjal considers a topic as its unit of analysis. A topic is defined as a MEDLINE query that is supported by the

Archive | 2004