Anjana Gosain
Guru Gobind Singh Indraprastha University
Publication
Featured research published by Anjana Gosain.
Requirements Engineering | 2008
Naveen Prakash; Anjana Gosain
We focus exclusively on the issue of requirements engineering for data warehouses (DW). Our position is that the information content of a DW is found in the larger context of the goals of an organization. We refer to this context as the organizational perspective. Goals identify the set of relevant decisions, which in turn help in determining the information needed to support them. The organizational perspective is converted into the technical perspective, which deals with the set of decisions to be supported and the information required. The latter defines the data warehouse contents. To elicit the technical perspective, we use the notion of an informational scenario. It is a typical interaction between a DW system and the decision maker and consists of a sequence of pairs of the form <information request, response>. We formulate an information request as a statement in an adapted form of SQL called Specification SQL. The proposals here are implemented in the form of an Informational Scenario Engine that processes informational scenarios and determines the data warehouse information contents.
international conference on conceptual modeling | 2004
Naveen Prakash; Yogesh Singh; Anjana Gosain
We propose a requirements elicitation process for a data warehouse (DW) that identifies its information contents. These contents support the set of decisions that can be made. Thus, if the information needed to take every decision is elicited, then the total information determines the DW contents. We propose an informational scenario as the means to elicit information for a decision. An informational scenario is written for each decision and is a sequence of pairs of the form <query, response>. A query requests the information necessary to take a decision, and the response is the information itself. The set of responses for all decisions identifies the DW contents. We show that informational scenarios are merely another subclass of the class of scenarios.
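As a rough illustration of the idea in this abstract, the sketch below models an informational scenario as a sequence of query and response pairs and takes the union of all responses as the DW contents. The names `Step` and `dw_contents`, and the sample decisions and attributes, are hypothetical and do not come from the paper; this is a minimal sketch, not the authors' elicitation process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One pair in an informational scenario (hypothetical structure)."""
    query: str           # the information requested for a decision
    response: frozenset  # the pieces of information returned


def dw_contents(scenarios):
    """Union of all responses over all decisions' scenarios."""
    contents = set()
    for steps in scenarios.values():
        for step in steps:
            contents |= step.response
    return contents


# Illustrative data only: one scenario per decision.
scenarios = {
    "open new store": [
        Step("sales by region", frozenset({"region", "sales"})),
        Step("population by region", frozenset({"region", "population"})),
    ],
    "discontinue product": [
        Step("sales by product", frozenset({"product", "sales"})),
    ],
}
```

Calling `dw_contents(scenarios)` on this sample yields the set of distinct information items requested across both decisions.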
ACM Sigsoft Software Engineering Notes | 2011
Anjana Gosain; Sushama Nagpal; Sangeeta Sabharwal
Multidimensional conceptual models have been accepted as the foundation for data warehouse designs. The quality of these models has a significant effect on the quality of the data warehouse and hence, in turn, on information quality. A few researchers have defined quality attributes for the conceptual models for data warehouses and have also proposed metrics to assess the quality attributes of these models objectively. The objective of this work is to propose candidate metrics to compute the structural complexity of a multidimensional model. The main emphasis of this paper is on the dimension hierarchies in the multidimensional model. Although these hierarchies play a very significant role in analysing data at various granularity levels, their use increases the structural complexity of the multidimensional model, which can affect its understandability and modifiability and, in turn, its maintainability.
2009 International Conference on Intelligent Agent & Multi-Agent Systems | 2009
Anjana Gosain; Amit Kumar
Data mining is an interesting field of research whose major objective is to acquire knowledge from large amounts of data. With advances in health care related research, there is a wealth of data available. However, there is a lack of effective analytical tools to discover hidden and meaningful patterns and trends in data, which is essential for any research.
Engineering Applications of Artificial Intelligence | 2013
Prabhjot Kaur; A. K. Soni; Anjana Gosain
A new data clustering algorithm, the Density oriented Kernelized version of Fuzzy c-means with new distance metric (DKFCM-new), is proposed. It creates noiseless clusters by identifying noise points and assigning them to a separate cluster. In an earlier work, the Density Based Fuzzy C-Means (DOFCM) algorithm with the Euclidean distance metric was proposed, which only considered the distance between the cluster centroid and the data points. In this paper, we try to improve the performance of DOFCM by incorporating a new distance measure that also considers the distance variation within a cluster to regularize the distance between a data point and the cluster centroid. This paper presents the kernel version of the method. Experiments are done using two-dimensional synthetic data-sets, standard data-sets from previous papers such as the DUNN data-set and the Bensaid data-set, and real-life high-dimensional data-sets such as the Wisconsin Breast Cancer data and the Iris data. The proposed method is compared with other kernel methods, various noise-resistant methods such as PCM, PFCM, CFCM and NC, and credal partition based clustering methods such as ECM, RECM and CECM. Results show that the proposed algorithm significantly outperforms its earlier version and other competitive algorithms.
International Journal of Information Quality | 2011
Anjana Gosain; Sangeeta Sabharwal; Sushama Nagpal
Data warehouses are large repositories designed to enable knowledge workers to take better and faster decisions. Due to their significance in strategic decisions, there is a need to assure data warehouse quality. One of the factors affecting data warehouse quality is multidimensional model quality. Although there are some useful guidelines for designing good multidimensional data models, objective indicators, i.e., metrics, are needed to help designers develop quality multidimensional models. A few researchers have proposed quality metrics for multidimensional models for data warehouses. These metrics need to be theoretically as well as empirically validated in order to prove their practical utility. In this paper, empirical validation using a controlled experiment is carried out. We evaluate not only the effect of each individual metric but also the effect of various combinations of metrics on data warehouse model quality, specifically understandability, in order to best explain the variance of the dependent variable due to the independent variables. The results show that these metrics may be used as objective indicators for understandability. Finally, the accuracy of our model in predicting multidimensional model quality is also evaluated.
ACM Sigsoft Software Engineering Notes | 2009
Manoj Kumar; Anjana Gosain; Yogesh Singh
In recent years, a number of requirements engineering (RE) proposals for data warehouse (DW) systems have been made. In traditional/operational systems, requirements engineering has been divided into two phases: the early and the late requirements engineering phase. Most data warehouse requirements engineering (DWRE) approaches have not distinguished the early requirements engineering phase from the late requirements engineering phase. Very few approaches in the literature explicitly model early and late requirements for a DW. In this paper, we propose an AGDI (Agent-Goal-Decision-Information) model to support early requirements engineering issues for a data warehouse. Here, early requirements have been modeled through organization modeling and goal modeling activities as an illustration of the proposed AGDI model to support the decisional goals of the organization for which the DW is to be built.
International Journal of Systems Assurance Engineering and Management | 2014
Manoj Kumar; Anjana Gosain; Yogesh Singh
Data warehouse (DW) quality depends on its data models (conceptual, logical and physical). Multidimensional (MD) modeling has been widely recognized as the backbone of data modeling for DW. Recently, some authors have proposed a set of structural metrics to assess the quality of MD conceptual models. They have found a significant relationship between the metrics and the understandability of DW conceptual schemas using correlation analysis techniques such as Spearman's and Pearson's. However, advanced statistical and machine learning methods have not been used to predict the effect of each metric on understandability. In this paper, our focus is on predicting the effect of structural metrics on the understandability of conceptual schemas using (i) a statistical method (logistic regression analysis) that includes univariate and multivariate analysis and (ii) machine learning methods (Decision Trees, Naive Bayesian Classifier), and (iii) on comparing the performance of these statistical and machine learning methods. The results obtained show that some of the metrics individually have a significant effect on the understandability of the MD conceptual schema. Further, a few of the metrics have a significant combined effect on the understandability of the conceptual schema. The results also show that the Naive Bayesian Classifier prediction method performs better than the logistic regression analysis and Decision Trees methods.
ACM Sigsoft Software Engineering Notes | 2012
Hemant K. Jain; Anjana Gosain
A data warehouse mainly stores integrated information over data from many different remote data sources for query and analysis. The integrated information at the data warehouse is stored in the form of materialized views. Using these materialized views, user queries may be answered quickly and efficiently, as the information may be directly available. These materialized views must be maintained in response to actual relation updates in the different remote sources. One of the issues related to materialized views is whether they should be recomputed or adapted incrementally after every change in the base relations. The process of updating a materialized view in response to changes to the underlying data is called view maintenance. Several algorithms have been developed by different authors to ease the problem of view maintenance for data warehouse systems. In this paper, we provide a comprehensive study of the research works of different authors related to DW view maintenance, considering various parameters, and present it in tabular form.
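The choice between recomputing a materialized view and adapting it incrementally can be illustrated with a small sketch. The example below assumes a hypothetical view holding a total and a row count per region over a base relation of (region, amount) rows; the function names and data are illustrative and are not taken from any of the surveyed algorithms.

```python
def recompute_view(base_rows):
    """Full recomputation: scan every base row and rebuild the view."""
    view = {}
    for region, amount in base_rows:
        total, count = view.get(region, (0, 0))
        view[region] = (total + amount, count + 1)
    return view


def apply_delta(view, inserted=(), deleted=()):
    """Incremental maintenance: touch only the groups affected by the change."""
    updated = dict(view)
    for region, amount in inserted:
        total, count = updated.get(region, (0, 0))
        updated[region] = (total + amount, count + 1)
    for region, amount in deleted:
        total, count = updated[region]
        if count == 1:
            # Last row of the group was deleted, so the group disappears.
            del updated[region]
        else:
            updated[region] = (total - amount, count - 1)
    return updated


# Illustrative data only.
base = [("north", 10), ("south", 5), ("north", 7)]
view = recompute_view(base)

inserted = [("east", 3)]
deleted = [("south", 5)]
new_base = [("north", 10), ("north", 7), ("east", 3)]

incremental = apply_delta(view, inserted, deleted)
```

Keeping the row count next to the total is what lets the incremental path drop a group once its last row is deleted, so that it agrees with a full recomputation over the changed base relation.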
International Journal of Advanced Computer Science and Applications | 2011
Garima Thakur; Anjana Gosain
Data in a warehouse can be perceived as a collection of materialized views that are generated as per the user requirements specified in the queries issued against the information contained in the warehouse. User requirements and constraints frequently change over time, which may cause the data and view definitions stored in a data warehouse to evolve dynamically. Current requirements are modified and new requirements are added in order to deal with the latest business scenarios. The data preserved in a warehouse, along with these materialized views, must also be updated and maintained so that they can deal with changes in the data sources as well as in the requirements stated by the users. Selection and maintenance of these views is one of the vital tasks in a data warehousing environment, in order to provide optimal efficiency by reducing the query response time as well as the query processing and maintenance costs. Another major issue related to materialized views is whether these views should be recomputed for every change in the definition or base relations, or adapted incrementally from existing views. In this paper, we examine several ways of performing changes in materialized views, and their selection and maintenance in data warehousing environments. We also provide a comprehensive study of the research works of different authors on various parameters and present it in tabular form.
Collaboration
Dive into Anjana Gosain's collaborations.
Ambedkar Institute of Advanced Communication Technologies and Research