Anjana Kakoti Mahanta

Publication

Featured research published by Anjana Kakoti Mahanta.

Pattern Recognition Letters | 2008

Finding calendar-based periodic patterns

Anjana Kakoti Mahanta; Fokrul Alom Mazarbhuiya; Hemanta K. Baruah

Mining patterns in a market-basket dataset is a well-stated problem, and a number of approaches exist to deal with it. Different types of patterns may be present in a dataset. An interesting type is patterns that hold seasonally, which are called calendar-based patterns. Earlier methods require periods to be specified by the user. We present a method that extracts the different types of periodic patterns that may exist in a temporal market-basket dataset without requiring the user to specify the periods in advance. We treat the time-stamps as a hierarchical data structure and then extract the different types of patterns. The algorithm can detect both wholly and partially periodic patterns. Although we have applied our approach to a market-basket dataset, it can be used for any event-related dataset where the events are associated with time intervals.
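The idea of treating time-stamps hierarchically can be illustrated with a minimal sketch. It is not the algorithm from the paper: the function names `build_hierarchy` and `yearly_month_periodicity` are hypothetical, the hierarchy is limited to year, month and day, and a month is scored only against the years in which the pattern was observed at all. A score of 1.0 is read here as wholly periodic at the yearly level, anything lower as partially periodic.

```python
from collections import defaultdict
from datetime import date


def build_hierarchy(dates):
    """Nest the dates on which a pattern holds as year -> month -> set of days."""
    tree = defaultdict(lambda: defaultdict(set))
    for d in dates:
        tree[d.year][d.month].add(d.day)
    return tree


def yearly_month_periodicity(dates):
    """For each month, return the fraction of observed years in which
    the pattern occurred at least once in that month."""
    tree = build_hierarchy(dates)
    n_years = len(tree)
    counts = defaultdict(int)
    for months in tree.values():
        for month in months:
            counts[month] += 1
    return {month: count / n_years for month, count in sorted(counts.items())}


# Hypothetical occurrence dates of one pattern (illustrative data only).
occurrences = [
    date(2005, 12, 24), date(2006, 12, 20), date(2007, 12, 26),
    date(2005, 7, 1), date(2007, 7, 4),
]
scores = yearly_month_periodicity(occurrences)
```

In this toy data the pattern occurs every December, so December scores 1.0, while July appears in only two of the three years and scores lower.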

International Conference on Data Mining | 2005

CloseMiner: discovering frequent closed itemsets using frequent closed tidsets

Ningthoujam Gourakishwar Singh; Sanasam Ranbir Singh; Anjana Kakoti Mahanta

The complete set of itemsets can be grouped into non-overlapping clusters identified by closed tidsets. Each cluster contains exactly one closed itemset, which is the superset of all the itemsets with the same support. The number of closed itemsets is identical to the number of clusters. Therefore, the problem of discovering closed itemsets can be considered as the problem of clustering the complete set of itemsets by closed tidsets. In this paper, we present CloseMiner, a new algorithm for discovering all frequent closed itemsets by grouping the complete set of itemsets into non-overlapping clusters identified by closed tidsets. An extensive experimental evaluation on a number of real and synthetic databases shows that CloseMiner outperforms Apriori and CHARM.
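The clustering view described above can be checked on a toy database with a brute-force sketch. This is not CloseMiner itself, which avoids enumerating every itemset; the helper names `tidset` and `closed_itemsets_by_tidset` and the sample transactions are assumptions made for illustration. Itemsets that share a tidset fall into one cluster, and the union of the itemsets in a cluster is that cluster's closed itemset.

```python
from collections import defaultdict
from itertools import combinations


def tidset(itemset, transactions):
    """Return the ids of all transactions that contain every item of itemset."""
    return frozenset(tid for tid, items in transactions.items() if itemset <= items)


def closed_itemsets_by_tidset(transactions, min_support=1):
    """Brute force: cluster every itemset by its tidset and take the union
    of each cluster, which is the cluster's single closed itemset."""
    all_items = sorted(set().union(*transactions.values()))
    clusters = defaultdict(set)
    for size in range(1, len(all_items) + 1):
        for combo in combinations(all_items, size):
            tids = tidset(frozenset(combo), transactions)
            if len(tids) >= min_support:
                clusters[tids] |= set(combo)
    return {tids: frozenset(items) for tids, items in clusters.items()}


# Hypothetical transaction database: transaction id -> set of items.
transactions = {
    1: {"a", "b", "c"},
    2: {"a", "b"},
    3: {"a", "c"},
    4: {"b"},
}
closed = closed_itemsets_by_tidset(transactions)
```

Here the itemsets {c} and {a, c} both occur in transactions 1 and 3, so they share one cluster whose closed itemset is {a, c}. The enumeration is exponential in the number of items and is only meant to make the grouping concrete.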

Journal of Experimental and Theoretical Artificial Intelligence | 2006

An algorithm for discovering the frequent closed itemsets in a large database

Ningthoujam Gourakishwar Singh; Sanasam Ranbir Singh; Anjana Kakoti Mahanta; Bhanu Prasad

Previous research revealed that the problem of discovering a complete set of frequent itemsets from a large database can be reduced to the problem of discovering the frequent closed itemsets, and this process results in a much smaller set of itemsets without information loss. This article is based on the observation that the set of all itemsets can be grouped into non-overlapping clusters such that each cluster is identified by a unique closed tidset. It is also found that there is only one closed itemset in each cluster and that it is the superset of all itemsets with the same support. Therefore, the problem of discovering closed itemsets can be further considered as the problem of clustering the set of itemsets and then identifying each cluster by a unique closed tidset. This article presents CloseMiner, a new algorithm for discovering all frequent closed itemsets by grouping the set of itemsets into non-overlapping clusters. Experimental evaluation on a number of real and synthetic databases shows that CloseMiner outperforms the existing systems APRIORI and CHARM.

International Journal of Computer Applications | 2013

Categorical Data Clustering based on an Alternative Data Representation Technique

Jyoti Prokash Goswami; Anjana Kakoti Mahanta

Clustering categorical data is more difficult than clustering numeric data. In numeric data the inherent geometric properties can be used in defining distance functions between data points. In the case of categorical data, a distance or dissimilarity function cannot be defined directly. The classical k-means algorithm has been extended to categorical data in [1], which proposes a new distance measure together with a method of representing a cluster using representatives that are very similar to the means used in the k-means algorithm. In this paper we first propose an alternative representation of categorical data as numeric data, making it easier to handle. This technique provides a uniform representation for data points and the cluster representatives. The similarity measure proposed in [2] has been used in this new setting. The algorithm used in [1] has been implemented and tested with this new setting, and the results obtained are reported. Experiments were conducted on two real-life data sets, namely the soybean diseases and mushroom data sets. The clusters obtained on the soybean dataset are pure clusters with 100% accuracy. On the mushroom dataset the method also gives relatively high accuracy with small errors.
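One common way to give data points and cluster representatives a uniform numeric form is an indicator encoding, sketched below. This is an assumption chosen for illustration and not necessarily the representation proposed in the paper; the function names and the sample records are hypothetical. Each record becomes a 0/1 vector over (attribute, value) pairs, and a cluster representative is the component-wise mean, which gives the relative frequency of each value in the cluster.

```python
def build_vocabulary(records):
    """Map each (attribute index, value) pair to a column position."""
    pairs = sorted({(i, v) for rec in records for i, v in enumerate(rec)})
    return {pair: col for col, pair in enumerate(pairs)}


def encode(record, vocab):
    """Encode one categorical record as a 0/1 indicator vector."""
    vec = [0.0] * len(vocab)
    for i, v in enumerate(record):
        vec[vocab[(i, v)]] = 1.0
    return vec


def representative(records, vocab):
    """Component-wise mean of the encoded records: the relative frequency
    of every attribute value within the cluster."""
    vectors = [encode(r, vocab) for r in records]
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


# Hypothetical cluster of records with attributes (colour, size).
cluster = [("red", "small"), ("red", "large"), ("blue", "small")]
vocab = build_vocabulary(cluster)
first_point = encode(cluster[0], vocab)
rep = representative(cluster, vocab)
```

Because points and representatives live in the same vector space, any numeric similarity measure can be applied to a point and a representative without a special case for categorical values.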

Pattern Recognition and Machine Intelligence | 2009

Mining Local Association Rules from Temporal Data Set

Fokrul Alom Mazarbhuiya; Muhammad Abulaish; Anjana Kakoti Mahanta; Tanvir Ahmad

In this paper, we present a novel approach for finding association rules from locally frequent itemsets using rough sets and Boolean reasoning. The rules mined in this way are termed local association rules. The efficacy of the proposed approach is established through experiments on a retail dataset that contains market-basket data from an anonymous Belgian retail store.

Pattern Recognition and Machine Intelligence | 2009

Mining Calendar-Based Periodicities of Patterns in Temporal Data

Mala Dutta; Anjana Kakoti Mahanta

An efficient algorithm with a worst-case time complexity of O(n log n) is proposed for detecting seasonal (calendar-based) periodicities of patterns in temporal datasets. Hierarchical data structures are used for representing the timestamps associated with the data. This representation facilitates the detection of different types of seasonal periodicities (yearly, monthly, daily, etc.) of patterns in the temporal dataset. The algorithm is tested with real-life data and the results are given.

Archive | 2018

An Incremental Algorithm for Mining Closed Frequent Intervals

Irani Hazarika; Anjana Kakoti Mahanta

Interval data are found in many real-life situations involving attributes like distance, time, etc. Mining closed frequent intervals from such data may provide useful information. Previous methods for finding closed frequent intervals assume that the data is static. In practice, the data in a dynamic database changes over time, with intervals being added and deleted continuously. In this paper, we propose an incremental method to mine frequent intervals from an interval database with n records, where each record represents one interval. This method assumes that intervals are added one by one into the database and each time an interval is added to the database, our proposed method will mine all the newly generated closed frequent intervals in O(n) time.

International Conference on Computing Communication Control and Automation | 2016

Design of a weighted meta classifier for imbalance data having heterogeneous features

Irani Hazarika; Rupam Chodhury; Anjana Kakoti Mahanta

This paper deals with the design of a weighted ensemble of classifiers to classify imbalanced data having heterogeneous features. For this purpose, a meta ensemble model is created in which the output of each base classifier is transformed from a class label into a [class label, weight] pair. The performance of the proposed method on various datasets is measured using classification accuracy and G-mean. The results obtained are compared with those of other methods, and the proposed method shows good results in most cases.
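The two ingredients named above, [class label, weight] outputs and the G-mean measure, can be sketched as follows. The abstract does not say how the pairs are combined, so summing the weights per label in `weighted_vote` is an assumption, and both function names are hypothetical. G-mean is computed in its standard form, the geometric mean of the per-class recalls, which penalises a classifier that ignores the minority class.

```python
import math
from collections import defaultdict


def weighted_vote(outputs):
    """Combine (class_label, weight) pairs, one per base classifier, by
    summing weights per label (assumed rule) and returning the heaviest label."""
    totals = defaultdict(float)
    for label, weight in outputs:
        totals[label] += weight
    return max(totals, key=totals.get)


def g_mean(y_true, y_pred):
    """Geometric mean of the recall obtained on each class in y_true."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return math.prod(recalls) ** (1 / len(recalls))


# Hypothetical outputs of three base classifiers for one sample.
decision = weighted_vote([("pos", 0.9), ("neg", 0.5), ("neg", 0.3)])

# Hypothetical imbalanced test set: two minority samples, four majority samples.
score = g_mean([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0])
```

In the example the single high-weight vote for "pos" outweighs the two weaker votes for "neg". The second call shows why G-mean is used alongside accuracy: five of six predictions are correct, yet the score is pulled down by the 50% recall on the minority class.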

IEEE International Conference on Electrical Computer and Communication Technologies | 2015

Mining closed intervals in an interval database: An incremental method

Mala Dutta; Malay Dutta; Anjana Kakoti Mahanta

In this paper, an incremental method for mining the set of closed intervals in an interval database is presented. In fast-growing data, new intervals are added to an interval database over time. Some earlier methods for mining closed intervals in an interval database assumed the database to be static, and hence such methods are not effective for databases whose sizes grow over time. Though an incremental method for mining closed intervals has been proposed earlier, the incremental method presented in this paper for the same problem is more time-efficient than the previous method. The method proposed in this paper takes only O(n) time to update the set of closed intervals in an interval database containing n intervals after a new interval is added to it, as compared to the O(n^2) time taken by the earlier incremental method. The method proposed in this paper has been tested on real-life and synthetic data and the results are reported.

IEEE India Conference | 2015

A genetic algorithm based ensemble approach for categorical data clustering

Jyoti Prokash Goswami; Anjana Kakoti Mahanta

In this paper, we propose a genetic algorithm based procedure to combine different clustering solutions obtained for the same data set to construct a relatively good solution. The clustering ensemble technique, also known as clustering aggregation or consensus clustering, considers the different individual solutions obtained and combines them into a single solution of better quality using a consensus function. In the genetic algorithm based consensus function proposed here, each cluster is represented using a single representative. A chromosome represents a set of cluster representatives for a particular clustering result. Single-point crossover is performed between the two cluster representatives of highest similarity value so that changes in the clusters of the selected pair of chromosomes are small, i.e. the clustering solutions gradually converge to the optimal one. Mutation is performed in such a way that the properties of cluster representatives remain invariant. A new fitness measure is proposed to evaluate the fitness of the cluster representatives as well as of the individual chromosomes. Experiments are conducted on real-life state-of-the-art data sets and the results are reported.
