Lipika Dey

Indian Institute of Technology Delhi

Publication

Featured research published by Lipika Dey.

Data and Knowledge Engineering | 2007

A k-mean clustering algorithm for mixed numeric and categorical data

Amir Ahmad; Lipika Dey

The use of traditional k-mean type algorithms is limited to numeric data. This paper presents a clustering algorithm based on the k-mean paradigm that works well for data with mixed numeric and categorical features. We propose a new cost function and distance measure based on co-occurrence of values. The measures also take into account the significance of an attribute towards the clustering process. We present a modified description of the cluster center to overcome the numeric-data-only limitation of the k-mean algorithm and provide a better characterization of clusters. The performance of this algorithm has been studied on real-world data sets. Comparisons with other clustering algorithms illustrate the effectiveness of this approach.

Sadhana-academy Proceedings in Engineering Sciences | 2002

Devnagari numeral recognition by combining decision of multiple connectionist classifiers

Reena Bajaj; Lipika Dey; Santanu Chaudhury

This paper is concerned with recognition of handwritten Devnagari numerals. The basic objective of the present work is to provide an efficient and reliable technique for recognition of handwritten numerals. Three different types of features have been used for classification of numerals. A multi-classifier connectionist architecture has been proposed for increasing reliability of the recognition results. Experimental results show that the technique is effective and reliable.

Pattern Recognition Letters | 2005

A feature selection technique for classificatory analysis

Amir Ahmad; Lipika Dey

Patterns summarizing mutual associations between class decisions and attribute values in a pre-classified database provide insight into the significance of attributes and also useful classificatory knowledge. In this paper we have proposed an efficient, conditional-probability-based method to extract the significant attributes from a database. Reducing the feature set during pre-processing enhances the quality of knowledge extracted and also increases the speed of computation. Our method supports easy visualization of classificatory knowledge. A likelihood-based classification algorithm that uses this classificatory knowledge is also proposed. We have also shown how the classification methodology can be used for cost-sensitive learning where both accuracy and precision of prediction are important.

Pattern Recognition Letters | 2007

A method to compute distance between two categorical values of same attribute in unsupervised learning for categorical data set

Amir Ahmad; Lipika Dey

Computation of similarity between categorical data objects in unsupervised learning is an important data mining problem. We propose a method to compute the distance between two values of the same attribute for unsupervised learning. This approach is based on the fact that the similarity of two attribute values depends on their relationship with other attributes. The computational cost of this method is linear in the number of data objects in the data set. To assess the effectiveness of the proposed distance measure, we use it with the K-mode clustering algorithm to cluster various categorical data sets. A significant improvement in clustering accuracy is observed compared with the results obtained using the traditional K-mode clustering algorithm.
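
As a rough illustration of the idea in this abstract, the Python sketch below measures how differently two values of one attribute co-occur with the values of the other attributes. It assumes the distance is the mean, over the other attributes, of half the absolute difference between the two conditional value distributions; the paper's exact formulation may differ. The function name value_distance and the toy rows in ROWS are hypothetical.

```python
from collections import Counter


def value_distance(rows, attr, x, y):
    """Distance between values x and y of column `attr`.

    Assumption (not taken from the paper): for every other column j,
    compare the distribution of column-j values among rows where
    column `attr` equals x with the one where it equals y, using half
    the L1 difference, then average over the other columns.
    """
    others = [j for j in range(len(rows[0])) if j != attr]
    if not others:
        raise ValueError("need at least one other attribute")
    total = 0.0
    for j in others:
        # Co-occurrence counts of column-j values with x and with y.
        cx = Counter(r[j] for r in rows if r[attr] == x)
        cy = Counter(r[j] for r in rows if r[attr] == y)
        nx, ny = sum(cx.values()), sum(cy.values())
        if nx == 0 or ny == 0:
            raise ValueError("both values must occur in the data")
        support = set(cx) | set(cy)
        # Counter returns 0 for values that never co-occur.
        total += 0.5 * sum(abs(cx[u] / nx - cy[u] / ny) for u in support)
    return total / len(others)


# Hypothetical toy data: column 0 is the attribute of interest.
ROWS = [
    ("a", "p"), ("a", "p"),
    ("b", "q"), ("b", "q"),
    ("c", "p"), ("c", "q"),
]

print(value_distance(ROWS, 0, "a", "b"))
print(value_distance(ROWS, 0, "a", "c"))
```

On these toy rows, "a" and "b" never share a co-occurring value, so their distance is 1.0, while "a" and "c" overlap half of the time and get 0.5. Each call scans the rows a fixed number of times per attribute, which is in line with the linear cost the abstract mentions.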

International Conference on Computing Theory and Applications | 2007

A Fuzzy Ontology Generation Framework for Handling Uncertainties and Nonuniformity in Domain Knowledge Description

Muhammad Abulaish; Lipika Dey

Since Web documents are not fully structured sources of information, and on the Internet almost everything, especially in the realm of search, is approximate in nature, it is not possible to utilize the benefits of a domain ontology straight away to extract information from such a document. One way of overcoming this problem is the postulation of a fuzzy ontology by adding a value for degree of membership to each term that is imprecise in nature. In this paper, we propose a fuzzy ontology generation framework in which a concept descriptor is represented as a fuzzy relation which encodes the degree of a property value using a fuzzy membership function. The fuzzy ontology framework provides appropriate support for application integration by identifying the most likely location of a particular term in the ontology. The applicability of the fuzzy ontology structure in retrieving and curating information from text documents to answer imprecise queries has been tested thoroughly in experiments.

Web Intelligence | 2006

Interoperability among Distributed Overlapping Ontologies--A Fuzzy Ontology Framework

Muhammad Abulaish; Lipika Dey

Ontologies are proposed as a means for knowledge sharing among applications, but it is often not possible to converge to a single unambiguous ontology that is acceptable to all knowledge engineers. Different ontologies vary greatly in terms of the level of detail of their representations, as well as the nature of their underlying logical specifications. Interoperability among different ontologies becomes essential to gain from the power of the existing domain ontologies. In this paper we have proposed a fuzzy ontology framework in which a concept descriptor is represented as a fuzzy relation which encodes the degree of a property value using a fuzzy membership function. Other than concept descriptors, the semantic relations in the ontology, such as IS-A and HAS-PART, are also associated with a strength of association. The strength of association between two concepts determines the uniformity with which these two concepts have been defined across different ontologies. The fuzzy ontology framework provides appropriate support for application integration by identifying the most likely location of a particular term in the ontology.

Atlantic Web Intelligence Conference | 2005

Ontology aided query expansion for retrieving relevant texts

Lipika Dey; Shailendra Singh; Romi Rai; Saurabh Gupta

Knowledge based approaches to text information retrieval are aimed at increasing the precision of retrieval. In this paper we show that query enhancement through the use of domain ontological structures can enhance the quality of retrieval to a large extent. We have presented a formal framework for extending user queries with domain ontological structures. The query-expansion mechanism has been implemented as a client-side query processor which can use any efficient search engine like Google or Alta Vista at the back end. The approach offers substantial performance gains. We have established the effectiveness of the approach experimentally through the use of single and multiple ontologies.

Information Processing and Management | 2005

A rough-fuzzy document grading system for customized text information retrieval

S. P. Singh; Lipika Dey

Due to the large repository of documents available on the web, users are usually inundated by a large volume of information, most of which is found to be irrelevant. Since user perspectives vary, a client-side text filtering system that learns the user's perspective can reduce the problem of irrelevant retrieval. In this paper, we have provided the design of a customized text information filtering system which learns user preferences and modifies the initial query to fetch better documents. It uses a rough-fuzzy reasoning scheme. The rough-set-based reasoning takes care of natural language nuances, like synonym handling, very elegantly. The fuzzy decider provides qualitative grading of the documents for the user's perusal. We have provided the detailed design of the various modules and some results related to the performance analysis of the system.

Web Intelligence | 2005

Biological Ontology Enhancement with Fuzzy Relations: A Text-Mining Framework

Muhammad Abulaish; Lipika Dey

Domain ontology can help in information retrieval from documents, but an ontology is a pre-defined structure with crisp concept descriptions and inter-concept relations. Due to the dynamic nature of the document repository, an ontology should be upgradeable with information extracted through text mining of documents in the domain. This also necessitates that concepts, their descriptions and inter-concept relations be associated with a degree of fuzziness that indicates the support for the extracted knowledge according to the currently available resources. Support values may be revised as more knowledge becomes available in the future. This approach preserves the basic structured knowledge format for storing domain knowledge, but at the same time allows for update of information. In this paper, we have proposed a mechanism which initiates text mining with a set of ontological concepts, and thereafter extracts fuzzy relations through text mining. Membership values of relations are functions of the frequency of co-occurrence of concepts and relations. We have worked on the GENIA corpus and shown how fuzzy relations can be further used for guided information extraction from MEDLINE documents.

Applied Soft Computing | 2005

A new customized document categorization scheme using rough membership

Shailendra Singh; Lipika Dey

One of the problems that plague document ranking is the inherent ambiguity that arises from the nuances of natural language. Though two documents may contain the same set of words, their relevance may be very different to a single user, since the context of the words usually determines the relevance of a document. The context of a document is very difficult to model mathematically other than through user preferences. Since it is difficult to perceive all possible user interests a priori and install filters for them at the server side, we propose a rough-set-based document filtering scheme which can be used to build customized filters at the user end. The documents retrieved by a traditional search engine can then be filtered automatically by this agent so that the user is not flooded with irrelevant material. A rough-set-based classificatory analysis is used to learn the user's bias for a category of documents. This is then used to filter out irrelevant documents for the user. To do this we have proposed the use of novel rough membership functions for computing the membership of a document to various categories.
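
For orientation, the classical rough membership function can be sketched in Python as follows. This is not one of the novel functions the paper proposes, which the abstract does not spell out. The sketch assumes that documents with identical term sets are indiscernible, and it takes a document's membership in the "relevant" category to be the fraction of its indiscernibility class that the user marked relevant. The names rough_membership, DOCS and RELEVANT and the toy data are hypothetical.

```python
from collections import defaultdict


def rough_membership(descriptions, target):
    """Classical rough membership |[x] & X| / |[x]| for every object x.

    descriptions: maps an object name to a hashable description;
                  objects with equal descriptions are indiscernible.
    target:       set of object names belonging to the category X.
    """
    # Group objects into indiscernibility classes.
    classes = defaultdict(set)
    for name, description in descriptions.items():
        classes[description].add(name)

    membership = {}
    for members in classes.values():
        # Share of the class that belongs to the target category.
        degree = len(members & target) / len(members)
        for name in members:
            membership[name] = degree
    return membership


# Hypothetical toy data: three documents share the same term set,
# but the user marked only one of them as relevant.
DOCS = {
    "d1": frozenset({"jaguar", "speed"}),
    "d2": frozenset({"jaguar", "speed"}),
    "d3": frozenset({"jaguar", "habitat"}),
    "d4": frozenset({"jaguar", "speed"}),
}
RELEVANT = {"d1", "d3"}
MEMBERSHIP = rough_membership(DOCS, RELEVANT)
print(MEMBERSHIP)
```

Here d1, d2 and d4 contain the same words yet differ in relevance, so each receives a graded membership of one third, while d3 is alone in its class and receives 1.0.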
