Soumya Sen

Information Technology University

Publication

Featured research published by Soumya Sen.

International Conference on Emerging Applications of Information Technology | 2011

A Framework to Convert XML Schema to ROLAP

Sarbani Dasgupta; Soumya Sen; Nabendu Chaki

A data warehouse provides the architecture and tools for business executives to systematically organize, understand, and use their data to make strategic decisions. XML, on the other hand, is used for e-commerce and Internet-based applications. Since many organizations use the web for business purposes, research has been done on integrating XML data into data warehouses. This paper illustrates an approach for integrating XML data, modeled by an XML schema, into a compatible data warehouse based on relational online analytical processing (ROLAP). The modeling is performed in three steps. First, a schema tree is constructed from the given XML schema. Next, an entity-relationship diagram (ER diagram) is derived from this schema tree. Finally, the dimension tables and the corresponding fact table are identified from the ER diagram to provide a suitable multidimensional data model for online analytical processing (OLAP). The relationships between the fact tables and the dimension tables are organized as a star schema or a snowflake schema, as required.
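The three steps in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the XSD, the element names, and the rule "simple child elements become measures, complex child elements become dimension tables" are all assumptions made for illustration, and the intermediate ER diagram step is skipped entirely.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical XML schema, invented for this example.
XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="sale">
    <xs:complexType><xs:sequence>
      <xs:element name="amount" type="xs:decimal"/>
      <xs:element name="product">
        <xs:complexType><xs:sequence>
          <xs:element name="name" type="xs:string"/>
          <xs:element name="category" type="xs:string"/>
        </xs:sequence></xs:complexType>
      </xs:element>
      <xs:element name="store">
        <xs:complexType><xs:sequence>
          <xs:element name="city" type="xs:string"/>
          <xs:element name="region" type="xs:string"/>
        </xs:sequence></xs:complexType>
      </xs:element>
    </xs:sequence></xs:complexType>
  </xs:element>
</xs:schema>"""


def schema_tree(element):
    """Return a nested dict {child_name: subtree}; simple elements map to None."""
    children = element.findall(f"{XS}complexType/{XS}sequence/{XS}element")
    if not children:
        return None
    return {child.get("name"): schema_tree(child) for child in children}


def star_schema(fact_name, tree):
    """Assumed mapping rule: leaves become measures, subtrees become dimensions."""
    fact = {"name": fact_name, "measures": [], "dimension_keys": []}
    dimensions = {}
    for name, subtree in tree.items():
        if subtree is None:
            fact["measures"].append(name)
        else:
            fact["dimension_keys"].append(f"{name}_id")
            dimensions[name] = list(subtree)
    return fact, dimensions


root = ET.fromstring(XSD).find(f"{XS}element")
tree = schema_tree(root)
fact, dims = star_schema(root.get("name"), tree)
```

Here `fact` describes one fact table (`sale`) with a single measure and two foreign keys, and `dims` lists the columns of the `product` and `store` dimension tables, which together form a small star schema.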

International Conference on Computer Sciences and Convergence Information Technology | 2009

Optimal Space and Time Complexity Analysis on the Lattice of Cuboids Using Galois Connections for Data Warehousing

Soumya Sen; Nabendu Chaki; Agostino Cortesi

In this paper, an optimal aggregation and counter-aggregation (drill-down) methodology is proposed for a multidimensional data cube. The main idea is to aggregate on smaller cuboids after partitioning them according to the cardinality of the individual dimensions. Based on the operations used to make these partitions, a Galois connection is identified for formal analysis, which makes it possible to guarantee the soundness of the storage-space and time-complexity optimizations for the abstraction and concretization functions defined on the lattice structure. Our contribution can be seen as an application of the abstract interpretation framework to OLAP operations on the multidimensional data model.
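The idea of aggregating from smaller cuboids can be sketched as follows. This is the common "smallest parent" heuristic, not the partitioning scheme of the paper; the dimension names and fact rows are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

DIMS = ("product", "store", "month")

# Hypothetical fact rows: one value per dimension, then the measure.
ROWS = [
    ("pen", "north", "jan", 10),
    ("pen", "south", "jan", 5),
    ("ink", "north", "feb", 7),
    ("pen", "north", "feb", 3),
]


def aggregate(parent_dims, parent_cells, child_dims):
    """Roll a parent cuboid up to the subset of dimensions in child_dims."""
    positions = [parent_dims.index(d) for d in child_dims]
    cells = defaultdict(int)
    for key, total in parent_cells.items():
        cells[tuple(key[i] for i in positions)] += total
    return dict(cells)


def build_lattice(dims, rows):
    """Compute every cuboid, each from its smallest already-computed parent."""
    base = defaultdict(int)
    for *key, measure in rows:
        base[tuple(key)] += measure
    lattice = {dims: dict(base)}
    for size in range(len(dims) - 1, -1, -1):
        for child in combinations(dims, size):
            parents = [p for p in lattice
                       if len(p) == size + 1 and set(child) <= set(p)]
            smallest = min(parents, key=lambda p: len(lattice[p]))
            lattice[child] = aggregate(smallest, lattice[smallest], child)
    return lattice


lattice = build_lattice(DIMS, ROWS)
```

With three dimensions the lattice holds 2^3 = 8 cuboids, from the base cuboid down to the apex cuboid `()` that stores the grand total.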

Medical & Biological Engineering & Computing | 2018

Clinical application of modified bag-of-features coupled with hybrid neural-based classifier in dengue fever classification using gene expression data

Sankhadeep Chatterjee; Nilanjan Dey; Fuqian Shi; Amira S. Ashour; Simon Fong; Soumya Sen

Dengue fever detection and classification play a vital role because of the recent outbreaks of different kinds of dengue fever. Recent advances in microarray technology can be employed for such classification. Several studies have established that the gene selection phase plays a significant role in classifier performance. Accordingly, the current study focuses on detecting two different variations, namely dengue fever (DF) and dengue hemorrhagic fever (DHF). A modified bag-of-features method is proposed to select the most promising genes for the classification process. A modified cuckoo search optimization algorithm is then used to support an artificial neural network (ANN-MCS) in classifying unknown subjects into three classes: DF, DHF, and a third class containing convalescent and normal cases. The proposed method is compared with three other well-known classifiers, namely a multilayer perceptron feed-forward network (MLP-FFN), an artificial neural network (ANN) trained with cuckoo search (ANN-CS), and an ANN trained with PSO (ANN-PSO). Experiments were carried out with different numbers of clusters for the initial bag-of-features-based feature selection phase. After the reduced dataset was obtained, the hybrid ANN-MCS model was employed for classification. The results are compared in terms of confusion-matrix-based performance metrics. The experimental results indicate a highly statistically significant improvement of the proposed classifier over the traditional ANN-CS model.

2017 1st International Conference on Electronics, Materials Engineering and Nano-Technology (IEMENTech) | 2017

Cuckoo search coupled artificial neural network in detection of chronic kidney disease

Sankhadeep Chatterjee; Soumen Banerjee; Pikorab Basu; Mainak Debnath; Soumya Sen

In the present work, a Cuckoo Search (CS) trained Neural Network (NN), or NN-CS, based model is proposed to detect Chronic Kidney Disease (CKD), which has become one of the newest threats to developing and undeveloped countries. Studies and surveys in different parts of India suggest that CKD is becoming a growing concern. The financial burden of treatment and the future consequences of CKD could be unaffordable to many if the disease is not detected at an early stage. Motivated by this, the NN-CS model is proposed, which significantly overcomes the problem of using local-search-based learning algorithms to train NNs. The input weight vector of the NN is gradually optimized by using CS to train the NN. The model is compared with well-known classifiers such as the Multilayer Perceptron Feedforward Network (MLP-FFN), trained with scaled conjugate gradient descent, and an NN supported by a Genetic Algorithm (NN-GA). The performance of the classifiers is measured in terms of accuracy, precision, recall, and F-measure. The experimental results suggest that the NN-CS based model is capable of detecting CKD more efficiently than any other existing model.
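The four measures named in this abstract follow directly from a binary confusion matrix. The sketch below uses their standard definitions; the counts are hypothetical and are not results from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard metrics from true/false positive and negative counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical counts for a CKD / not-CKD classifier on 100 test subjects.
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)
```

For these counts the accuracy is 0.85 and the precision is 0.80; the F-measure is the harmonic mean of precision and recall, which equals 2·TP / (2·TP + FP + FN).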

Computer Information Systems and Industrial Management Applications | 2012

A new scale for attribute dependency in large database systems

Soumya Sen; Anjan Dutta; Agostino Cortesi; Nabendu Chaki

Large, data-centric applications are characterized by their different attributes. Today, the vast majority of large data-centric applications are based on the relational model. Databases are collections of tables, and every table consists of a number of attributes. The data is typically accessed through SQL queries. The queries being executed can be analyzed for different types of optimization. Analysis based on the attributes used in a set of queries can guide database administrators in speeding up query execution. A better model in this context would help in predicting the nature of the upcoming query set. An effective prediction model would be useful in different database, data warehouse, and data mining applications. In this paper, a numeric scale is proposed to quantify the strength of association between independent data attributes. The proposed scale is built on a probabilistic analysis of the usage of attributes in different queries. The methodology thus aims to predict the future usage of attributes based on their current usage.
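One simple way to put a number on the association between two attributes is the conditional probability that one appears in a query given that the other does. The abstract does not define the proposed scale, so the sketch below is only an illustration of the general idea; the query log and attribute names are hypothetical.

```python
# Hypothetical query log: the set of attributes referenced by each query.
QUERIES = [
    {"a", "b"},
    {"a", "b", "c"},
    {"a"},
    {"b", "c"},
]


def dependency(queries, given, target):
    """Estimate P(target is used | given is used) from the query log."""
    with_given = [q for q in queries if given in q]
    if not with_given:
        return 0.0
    return sum(1 for q in with_given if target in q) / len(with_given)


b_given_c = dependency(QUERIES, given="c", target="b")  # every query using c also uses b
b_given_a = dependency(QUERIES, given="a", target="b")  # two of the three queries using a
```

The measure is asymmetric: in this log `b` always accompanies `c`, but `c` accompanies `b` in only two of the three queries that use `b`.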

Journal of Computational Science | 2014

Dynamic discovery of query path on the lattice of cuboids using hierarchical data granularity and storage hierarchy

Soumya Sen; Santanu Roy; Anirban Sarkar; Nabendu Chaki; Narayan C. Debnath

Analytical processing on multidimensional data is performed over a data warehouse. The data is, in general, presented in the form of cuboids. The central theme of the data warehouse is represented by the fact table, which is built from the related dimension tables. The cuboid that corresponds to the fact table is called the base cuboid. All possible combinations of cuboids can be generated from the base cuboid using successive roll-up operations, and together they form a lattice structure. Some of the dimensions may have a concept hierarchy, that is, multiple granularities of data, so that a dimension is represented at more than one level of abstraction. Typically, neither all the cuboids nor all the concept hierarchies are required for a specific business process. The cuboids reside in different layers of the memory hierarchy, such as cache memory, primary memory, and secondary memory. This research work dynamically finds the most cost-effective path in the lattice structure of cuboids, based on the concept hierarchy, to minimize query access time. Knowledge of the location of the cuboids in the different memory elements is used for this purpose.
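The effect of the storage layer on the choice of cuboid can be shown with a much simplified sketch. It picks a single source cuboid rather than a multi-step path through the lattice, and the cuboid sizes, layer names, and per-row access costs are hypothetical, not values from the paper.

```python
# Hypothetical relative cost of reading one row from each storage layer.
ACCESS_COST = {"cache": 1, "ram": 10, "disk": 1000}

# Hypothetical materialized cuboids: dimensions -> (row count, storage layer).
MATERIALIZED = {
    ("product", "store", "month"): (1000, "disk"),
    ("product", "month"): (200, "ram"),
    ("product",): (20, "disk"),
}


def cheapest_source(query_dims):
    """Return the cuboid that can answer the query at the lowest read cost."""
    candidates = {
        dims: rows * ACCESS_COST[layer]
        for dims, (rows, layer) in MATERIALIZED.items()
        if set(query_dims) <= set(dims)  # the cuboid must cover the query dimensions
    }
    if not candidates:
        raise LookupError(f"no materialized cuboid covers {query_dims}")
    best = min(candidates, key=candidates.get)
    return best, candidates[best]


source, cost = cheapest_source(("product",))
```

For a query on `product` alone, the 200-row cuboid held in primary memory (cost 2000) beats the exactly matching 20-row cuboid on disk (cost 20000), which is why the location of each cuboid matters as much as its size.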

International Conference on Emerging Applications of Information Technology | 2012

Materialized view construction using linear regression on attributes

Partha Ghosh; Soumya Sen; Nabendu Chaki

Materialized view creation is an important aspect of large data-centric applications. Materialized views provide users with an abstraction over the actual database tables. Users are unaware of the existence of these materialized views, yet the views help queries execute faster. Materialized views should contain the data that users are currently accessing, and possibly the data that will be accessed in the near future. The availability of user-requested data in a materialized view indicates the efficacy of the materialized view creation process. A review of existing research reveals a gap in analyzing inter-attribute affinity while creating materialized views. This paper proposes a new methodology for materialized view creation that quantifies the association among independent data attributes, based on the usage of different attributes in the recently executed set of queries. Statistical analysis of the existing query set helps to predict the attributes likely to be used in future queries. The materialized views are generated accordingly.
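The abstract does not say how the regression is applied, so the following is only one plausible reading: fit a least-squares line to each attribute's usage count over recent time windows, extrapolate one window ahead, and materialize the attributes with the highest predicted usage. The attribute names and counts are hypothetical.

```python
# Hypothetical usage counts per attribute over four consecutive time windows.
USAGE = {
    "a": [1, 2, 3, 4],
    "b": [4, 3, 2, 1],
    "c": [2, 2, 2, 2],
}


def predict_next(counts):
    """Fit y = slope * x + intercept by least squares and extrapolate one step."""
    n = len(counts)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope * n + intercept


def attributes_to_materialize(usage, k):
    """Pick the k attributes with the highest predicted usage in the next window."""
    predicted = {name: predict_next(counts) for name, counts in usage.items()}
    return sorted(predicted, key=predicted.get, reverse=True)[:k]


chosen = attributes_to_materialize(USAGE, k=2)
```

Attribute `b` has the highest total usage so far but a falling trend, so the extrapolation ranks it below `a` and `c`.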

2012 International Conference on Computing Sciences | 2012

An Architecture to Maintain Materialized View in Cloud Computing Environment for OLAP Processing

Soumya Sen; Debabrata Datta; Nabendu Chaki

Cloud computing is an emerging technology that empowers present-day business by providing services on demand instead of an integrated product. Many cloud applications deal with huge amounts of data, which are often used for analytical processing to extract business intelligence. However, working with data at very large scale is often time-consuming. Materialized views are built and maintained to pre-fetch an effective subset of the entire database for current and near-future use. The materialized views are constructed on data warehouses, data marts, and virtual data warehouses. In a cloud computing scenario, the materialized views for the distributed data centers quite often reside on different data servers. One of the major challenges is handling multiple OLAP data sources. The data needs to be integrated and analyzed continually and efficiently before the views are built. This paper emphasizes integrating heterogeneous data sources to create virtual data warehouses that can be deployed in a cloud environment.

Computer Information Systems and Industrial Management Applications | 2017

Towards Golden Rule of Capital Accumulation: A Genetic Algorithm Approach

Sankhadeep Chatterjee; Rhitaban Nag; Soumya Sen; Amitrajit Sarkar

The current study deals with maximizing consumption per worker in connection with the economic growth of society. The traditional Solow-model-based approach is well studied and computationally complex. The present work proposes Genetic Algorithm (GA) based consumption maximization for attaining the Golden Rule. An objective function derived from the traditional Solow model, based on the depreciation rate and the amount of accumulated capital, is used. The study assumes a constant output per worker to incorporate a constant efficiency level of labor. Different ranges of depreciation rate and accumulated capital are tested to check the stability of the proposed GA-based optimization process. The mean error and standard deviation of the optimization process are used as performance metrics. The experimental results suggest that the GA is very fast and is able to produce economically significant results, with an average mean error of 0.142% and a standard deviation of 0.021%.
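A GA-based search for the Golden Rule capital level can be sketched on the textbook Solow steady state, where consumption per worker is c(k) = k^α − δk with no population growth. This is not the objective function of the paper, which assumes constant output per worker; the parameter values, population size, and GA operators below are all assumptions.

```python
import random

ALPHA, DELTA = 0.5, 0.1  # hypothetical capital share and depreciation rate
K_MAX = 100.0            # hypothetical upper bound on capital per worker


def consumption(k):
    """Textbook steady-state consumption per worker: output minus depreciation."""
    return k ** ALPHA - DELTA * k


def genetic_search(pop_size=30, generations=60, seed=0):
    rng = random.Random(seed)
    population = [rng.uniform(0.0, K_MAX) for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=consumption, reverse=True)
        history.append(consumption(population[0]))
        elite = population[: pop_size // 2]  # elitism: the fitter half survives
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = (a + b) / 2 + rng.gauss(0.0, 1.0)  # crossover plus mutation
            children.append(min(max(child, 0.0), K_MAX))
        population = elite + children
    best = max(population, key=consumption)
    history.append(consumption(best))
    return best, history


best_k, history = genetic_search()

# Closed-form Golden Rule for this textbook model, used only as a reference.
k_star = (ALPHA / DELTA) ** (1 / (1 - ALPHA))
c_star = consumption(k_star)
```

Because the fitter half of the population always survives, the best consumption level never decreases from one generation to the next, and it can be compared against the closed-form value `c_star` to measure the optimization error.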

International Conference on Emerging Applications of Information Technology | 2011

Efficient Traversal in Data Warehouse Based on Concept Hierarchy Using Galois Connections

Soumya Sen; Nabendu Chaki

This paper proposes a new methodology for the efficient implementation of OLAP operations using concept hierarchies of attributes in a data warehouse. The different granularities associated with a particular dimension, and the hierarchy among them, may be represented as a lattice. The focus is on moving up (roll-up) and down (drill-down) within the lattice structure using an algorithm with optimal time complexity. A new algorithm is proposed that uses a dynamic data structure which shrinks over time, resulting in better space utilization and reduced computation time. A Galois connection is identified on this lattice structure, with well-defined abstraction and concretization functions based on the concept hierarchy. The contribution offers a formalism for analysis using concept hierarchies in an abstract interpretation framework.
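Roll-up and drill-down along a concept hierarchy can be read as an abstraction function α and a concretization function γ. The abstract does not give the paper's definitions, so the sketch below uses the standard image/preimage construction on a hypothetical city-to-country hierarchy and checks the defining property of a Galois connection, α(X) ⊆ Y if and only if X ⊆ γ(Y), exhaustively.

```python
from itertools import chain, combinations

# Hypothetical one-level concept hierarchy: city -> country.
PARENT = {
    "Kolkata": "India",
    "Delhi": "India",
    "Venice": "Italy",
    "Rome": "Italy",
}


def alpha(cities):
    """Abstraction (roll-up): the countries covering a set of cities."""
    return frozenset(PARENT[city] for city in cities)


def gamma(countries):
    """Concretization (drill-down): every city belonging to the given countries."""
    return frozenset(city for city, country in PARENT.items() if country in countries)


def powerset(items):
    items = list(items)
    subsets = chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )
    return [frozenset(subset) for subset in subsets]


def is_galois_connection():
    """Check alpha(X) <= Y  <=>  X <= gamma(Y) for every pair of subsets."""
    return all(
        (alpha(x) <= y) == (x <= gamma(y))
        for x in powerset(PARENT)
        for y in powerset(set(PARENT.values()))
    )


holds = is_galois_connection()
```

Drilling down after rolling up never loses cities: `gamma(alpha(X))` always contains `X`, which is the soundness guarantee that the Galois connection provides.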
