Goal-oriented Data Warehouse Quality Measurement
A Goal-oriented Framework for Data Warehousing Quality Measurement
Cristina Cachero and Jesús Pardillo
Department of Software and Computing Systems, University of Alicante, Spain
{ccachero,jesuspv}@dlsi.ua.es
Abstract.
Requirements engineering is known to be a key factor for the success of software projects. Within this discipline, goal-oriented requirements engineering approaches have proven especially suitable for projects where it is necessary to capture the alignment between system requirements and stakeholders' needs, as is the case of data-warehousing projects. However, the mere alignment of data-warehouse system requirements with business goals is not enough to assure better data-warehousing products; measures and techniques are also needed to assure data-warehouse quality. In this paper, we provide a modelling framework for data-warehouse quality measurement (i*DWQM). This framework, conceived as an i* extension, provides support for the definition of data-warehouse requirements analysis models that include quantifiable quality scenarios, defined in terms of well-formed measures. This extension has been defined by means of a UML profiling architecture. The resulting framework has been implemented in the Eclipse development platform.
Key words:
UML, data-warehouse, goal-oriented, i*, measurement, requirements, modelling

Data-warehouse systems provide a multidimensional view of heterogeneous operational data sources in order to supply valuable information to decision makers. Their development is usually based on multidimensional modelling, because of its intuitiveness and its support for high-performance queries [9, 10]. Since the data warehouse integrates several operational data sources, the design of multidimensional models has traditionally been guided by supply-driven approaches [7, 8]. However, in order to assure that such designs are adequate for the information needs of decision makers, a requirements-analysis stage is needed. For this stage, goal-oriented frameworks have proven especially suitable, for two reasons. First, goal-oriented frameworks provide constructs for the modelling of large organisational contexts, which are commonplace in data warehouses. Second, they match the way in which decision makers express themselves, i.e., in terms of general expectations or objectives that the data warehouse should support. This suitability has materialised in proposals such as i*DWRA [12] or the one presented in [6].

However, the inclusion of goals, although necessary, may not be sufficient to guarantee the quality of data-warehouse systems. Indeed, although a good methodology with accurate goal definitions may lead to good and suitable data-warehouse models, many other factors, such as human decisions, could influence their quality. It is thus necessary to complement data-warehousing methodologies with measures and techniques for product quality assessment [16, 17, 2, 15]. One of the best-known techniques in this sense, emphasised in well-known software development processes such as the unified process (UP) [11], is the definition of quality scenarios as part of the requirements-analysis workflow. Quality scenarios define measures that serve to validate the requirement with which they are associated. Furthermore, they specify the context in which the measurement process is to take place (e.g., measuring performance with 10 simultaneous users is not the same as measuring it with 10,000). Quality scenarios turn requirements into measurable requirements.

In order to model these measurable requirements, in this paper we extend i*DWRA. The result of this extension is the i*-based data-warehousing quality measurement framework (i*DWQM). An important advantage of this framework is that, to the best of our knowledge, it is the first proposal that traces quality scenarios back to the originating stakeholders' needs. Also, our proposal stresses the role of an often forgotten actor in data-warehouse development: the quality manager. Quality managers are responsible for orchestrating and leveraging the different stakeholders' interests during data-warehouse development. Such interests include certain quality restrictions that must be respected for the project to be considered successful. It is important to note that, contrary to other measures proposed by different authors for i* diagrams [3, 4], our emphasis is not on the quality of the diagram per se, but on the provision of mechanisms to model the levels of quality required by the system under development.

[Figure 1: diagram relating i*DWRA (i* for Data Warehouse Requirements Analysis), the decision maker, i*DWQM (i* for Data Warehouse Quality Measurement, our proposal), the quality stakeholder, and the quality-assured data warehouse.]
Fig. 1. Adding measurable quality scenarios for data-warehouse requirements analysis
The remainder of the paper is structured as follows: we present i*DWQM next (§2) as an extension of i*DWRA (see Fig. 1) that, for the sake of understandability, is also sketched. Both the measurement concepts and the chosen notation are then further illustrated with a sample application. The framework has been defined by means of the unified modelling language (UML) [14] profiling capabilities, which has permitted us to implement it in the Eclipse development platform (§4).

Empirical research shows that the definition of measures and the description of measuring efforts in the literature suffer from the typical symptoms of any relatively young discipline [1] and present many flaws that compromise their completeness and consistency. In order to overcome these problems, a software measurement ontology (SMO) has been proposed in [5]. Until the new ISO/IEC 25000 standard series (software product quality requirements and evaluation, SQuaRE) appears, SMO reflects a compromise solution to solve the many inconsistencies and gaps detected in standards and research proposals. We have followed this ontology for the definition of i*DWQM, in order to facilitate its adoption in the measurement domain. For the sake of understandability, Appendix A formally reproduces the definitions of the ontology terms used throughout this paper. Interested readers may find further information about the whole ontology in [5].

i*DWRA
As mentioned above, i*DWRA is a data-warehouse requirements analysis framework that has proven useful for the discovery of data-warehouse requirements from business goals. This purpose is achieved by identifying the decisions that decision makers usually face. The i*DWRA framework has been defined in two steps. First, in [12], a UML profile for i* (the i* profile, see Table 1, col. 4 & 5) has been provided. This profile redefines i* concepts and relationships [18] in terms of UML modelling elements. These elements make it possible to model both the organisational context (by means of the i* strategic dependency (SD) diagram) and the actors' rationale (by means of the i* strategic rationale (SR) diagram) when interacting with the data warehouse. Such concepts include intentional elements (actors, goals, tasks, softgoals, resources, and beliefs) and intentional relationships (intentional dependencies, means-end relationships, task-decompositions, and contributions).

Over this i* profile, the second step has consisted in adding specific semantics for data-warehouse requirements analysis (see Table 1, col. 1–3). For the sake of simplicity, this table lists only the i* elements that have been extended by the i*DWRA framework.
Table 1. Mapping i*DWRA concepts into the i* framework

Analysis Concept     i*DWRA Stereotype   i*DWRA Notation             i* Profile Stereotype   i* Profile UML
Strategic Goal       Strategy            symbol + «strategy»         Goal                    Class
Decisional Goal      Decision            symbol + «decision»         Goal                    Class
Informational Goal   Information         symbol + «information»      Goal                    Class
Info. Requirement    Requirement         symbol + «task»             Task                    Class
Context              Resource            symbol + «context»          Resource                Class
Measure              Resource            symbol + «measure»          Resource                Class
Let us now give an example to illustrate how to read Table 1. Let us assume that we wish to model a data-warehouse information requirement (see col. 1) in i*DWRA. For this task, we would have to use the Requirement stereotype (col. 2). This stereotype provides additional semantics and notation (col. 3) to the i* task concept. i* tasks are mapped into UML by means of the Task stereotype (col. 4) on the UML Class modelling element (col. 5).

This modelling framework has been the basis on which we have performed a further extension to permit the definition of quality scenarios, which we have called the i* data-warehouse quality measurement (i*DWQM) framework.

It is worth noting that the Measure concept defined in i*DWRA refers to the analysis measures employed during the decision-making process supported by the data warehouse. Therefore, it must not be confused with the measure concept that appears in the context of data-warehouse quality scenarios, which we will explore next.

i*DWQM
Table 2. Mapping of SMO concepts into i*DWQM

SMO Concept                                                 Equivalent i* Element
Indicator, Derived Measure, Base Measure                    Goal
Analysis Model, Measurement Function, Measurement Method    Task
Entity Class, Decision Criteria                             Resource
Attribute                                                   Belief
The i*DWQM framework enriches i*DWRA with the capability of specifying quantifiable quality scenarios for the assurance of quality requirements associated with data warehouses. As mentioned above, our proposal is based on SMO [5], in order to facilitate its understandability and help in its adoption by quality stakeholders.
Mapping Quality Stakeholders into Actors.
In order to model quality stakeholders in i*DWQM, following the i*DWRA proposal, we use the i* actor modelling element. For instance, quality managers are actors that are in charge of defining quantifiable quality scenarios.

Modelling Quality Scenarios for Data Warehouses.
i*DWQM establishes a correspondence between particular SMO measurement concepts and more general i*DWRA concepts. This mapping is presented in Table 2. In this table, we can observe how SMO measures (Indicators, Derived Measures and Base Measures) are mapped into measurement Goals that can be achieved through certain Tasks. These tasks consist in performing an Analysis Model, a Measurement Function or a Measurement Method, respectively. The measurement concepts Entity Class and Decision Criteria are mapped into Resources, while the Attribute concept is mapped into a Belief in i*DWRA. Similarly, i*DWQM maps the SMO relationships into i* intentional relationships in a hierarchical manner, as can be seen in Fig. 2.

[Figure 2: upper part, object diagram for SMO concepts (Indicator, AnalysisModel, DecisionCriteria, DerivedMeasure, MeasurementFunction, BaseMeasure, MeasurementMethod, EntityClass, Attribute) linked by "Has", "Uses", "Calculated with" and "Is performed on" relationships; lower part, the i* correspondence, in which the quality stakeholder's quality scenario is expressed with goals, tasks, resources and a belief.]
Fig. 2. Mapping an SMO occurrence for specifying quality scenarios with i*DWQM
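The correspondence of Table 2 can be read as a simple lookup. The following Python fragment is only an illustrative sketch: the dictionary names and the helper `istar_element` are hypothetical and are not part of the i*DWQM profile; the concept names are taken from Table 2.

```python
# Table 2 as a lookup: SMO concept -> i* element that hosts it.
SMO_TO_ISTAR = {
    "Indicator": "Goal",
    "Derived Measure": "Goal",
    "Base Measure": "Goal",
    "Analysis Model": "Task",
    "Measurement Function": "Task",
    "Measurement Method": "Task",
    "Entity Class": "Resource",
    "Decision Criteria": "Resource",
    "Attribute": "Belief",
}

# Each kind of measure (a goal) is achieved through one kind of task,
# in the order given in the text ("respectively").
ACHIEVED_THROUGH = {
    "Indicator": "Analysis Model",
    "Derived Measure": "Measurement Function",
    "Base Measure": "Measurement Method",
}


def istar_element(smo_concept: str) -> str:
    """Return the i* element for an SMO concept (hypothetical helper)."""
    return SMO_TO_ISTAR[smo_concept]


for measure, task in ACHIEVED_THROUGH.items():
    print(f"{measure} ({istar_element(measure)}) <- {task} ({istar_element(task)})")
```

Keeping the two dictionaries separate mirrors the text: the first one classifies concepts, the second one records which task achieves which measurement goal.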
The upper part of this figure presents an SMO-based model that represents a generic quality scenario, while the lower part presents the equivalent i* model that has served i*DWQM as a basis for further enrichment. In this figure, we observe how, in SMO, an Indicator is related to an Analysis Model, which in turn has one or more associated Decision Criteria. An indicator is in fact a type of Measure that is made up of several other measures, be they Derived Measures or Base Measures. Derived measures are related to Measurement Functions, while base measures are associated with Measurement Methods. In SMO, analysis models, measurement functions and measurement methods can be related to Entity Classes through Attributes.

The counterpart i*DWQM relationships are presented in the lower part of Fig. 2. In this figure, we observe how indicators (mapped into goals) and analysis models (tasks) are related through a Means-End relationship. The same relationship appears between derived measures (goals) and their corresponding measurement functions (tasks), and between base measures (goals) and their corresponding measurement methods (tasks). Another relevant relationship is the Task Decomposition that appears between analysis models (tasks) and their related measures (goals) or decision criteria (resources), between measurement functions (tasks) and the associated measures (goals), and between measurement methods (tasks) and the associated entity classes (resources). Last, a Contribution relationship provides the attribute (belief) that permits the connection between measurement methods (tasks) and entity classes (resources). With this structure, quality managers can specify the quality scenarios associated with data-warehouse requirements.
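To make the chain of Means-End and Task Decomposition relationships concrete, the sketch below stores one generic quality scenario as a list of links and unfolds everything that is needed to achieve a goal. It is a plain Python illustration under stated assumptions: the link list covers a single chain (indicator, derived measure, base measure), the function names are hypothetical, and the Contribution relationship with its belief is left out because the text does not spell out its direction.

```python
# (relationship, source, target); node names are the SMO concepts of Fig. 2.
# For means-end links the source is the task (means) and the target the goal (end).
LINKS = [
    ("means-end", "Analysis Model", "Indicator"),
    ("means-end", "Measurement Function", "Derived Measure"),
    ("means-end", "Measurement Method", "Base Measure"),
    ("task-decomposition", "Analysis Model", "Derived Measure"),
    ("task-decomposition", "Analysis Model", "Decision Criteria"),
    ("task-decomposition", "Measurement Function", "Base Measure"),
    ("task-decomposition", "Measurement Method", "Entity Class"),
]


def means_for(goal):
    """Tasks that are the means for achieving the given goal."""
    return [src for kind, src, dst in LINKS if kind == "means-end" and dst == goal]


def parts_of(task):
    """Elements into which the given task is decomposed."""
    return [dst for kind, src, dst in LINKS
            if kind == "task-decomposition" and src == task]


def unfold(goal):
    """Depth-first list of every element needed to achieve a goal."""
    needed = [goal]
    for task in means_for(goal):
        needed.append(task)
        for part in parts_of(task):
            # A part that has its own means is a goal and is unfolded in turn.
            needed.extend(unfold(part) if means_for(part) else [part])
    return needed


print(unfold("Indicator"))
```

Unfolding the indicator visits the analysis model first, then the derived and base measures with their tasks, and ends with the entity class and the decision criteria, which is the hierarchical reading of the lower part of Fig. 2.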
Modelling Dependencies among Stakeholders.
The quality scenarios modelled with i*DWQM must be connected with particular non-functional requirements (softgoals) in the context of a particular data-warehouse informational scenario. Such information requirements and the associated softgoals provide, respectively, the context and the rationale for the measurement activity. As mentioned above, i*DWRA information requirements are modelled as analysis Tasks. Analysis tasks may have different associated softgoals, which specify how the decision maker expects those tasks to be performed. At this point, existing quality models for data warehouses [15] are useful to choose among the set of non-functional requirements that are typical of this type of application.

From the existing relationship between softgoals and measures, a dependency between the corresponding actors can be inferred. In i*DWQM, this dependency is modelled as an intentional dependency from decision makers' softgoals (depender) to quality stakeholders' goal indicators (dependee) in order to achieve a given quality scenario goal (dependum) that a particular quality stakeholder knows how to measure. This modelling solution can be seen in Fig. 3.

Modelling Measurement Attributes.
In the mapping presented so far, some SMO concepts, namely the measure characteristics Unit Of Measurement, Scale, and Type Of Scale, are still missing. For the modelling of these concepts, i*DWQM makes use of the notes mechanism enabled in i*. Furthermore, the UML scaffolding that we have used to implement i*DWQM (§4) provides the necessary level of formalism to properly specify these SMO concepts.
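As an illustration of how these three measure characteristics can be attached to a measure, the following Python sketch renders them as note strings in the style used in Fig. 3. The class and method names are hypothetical, and the list of scale types is an assumption (the conventional nominal, ordinal, interval and ratio types); the paper's example only uses the ratio type.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class TypeOfScale(Enum):
    # Assumed set of scale types; only "ratio" appears in the paper's example.
    NOMINAL = "nominal"
    ORDINAL = "ordinal"
    INTERVAL = "interval"
    RATIO = "ratio"


@dataclass(frozen=True)
class MeasureAttributes:
    unit: str
    scale: str
    type_of_scale: TypeOfScale

    def as_notes(self, measure: str) -> List[str]:
        """Render the three characteristics as note lines for one measure."""
        return [
            f'{measure}.unit="{self.unit}"',
            f'{measure}.scale="{self.scale}"',
            f"{measure}.typeOfScale={self.type_of_scale.value}",
        ]


# Values taken from the notes of Fig. 3, shared by the three timing measures.
timing = MeasureAttributes(unit="seconds", scale="Natural",
                           type_of_scale=TypeOfScale.RATIO)
notes = {m: timing.as_notes(m) for m in ("RDT", "DST", "LCT")}
for lines in notes.values():
    print("\n".join(lines))
```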
The case study chosen to illustrate our approach consists of a company selling automobiles across several countries. In this example (see Fig. 3), we have identified a sales manager as an actor that has several information requirements to be fulfilled by the data warehouse to be developed. During the requirements discovery phase, we have identified that "automobile sales be increased" is a strategic goal of the sales manager. From this strategic goal, several decision goals have been derived: "sales price be decreased", "promotions be determined", and so on. Focusing on the first decision goal, we have obtained two information goals: "automobile price be analysed" and "automobile sales be analysed". Concerning the first one, we have recognised that the information requirement "analyse automobile sale price" is the means for achieving this information goal, and for this analysis, the sales manager needs to check the prices and automobiles as fact and dimensions of the data-warehouse analysis, respectively.

[Figure 3: i*DWQM model with two actors, the Sales Manager and the Quality Manager. Sales Manager side: «strategy» Automobile Sales be Increased; «decision» Sale Price be Decreased and Promotions be Determined; «information» Automobile Sales be Analysed and Automobile Price be Analysed; «requirement» Analyse Automobile Sale Price, with «context» Automobile and «measure» Price; softgoal Flexibility [Reporting]; further labels: Sales, «context» Date, «businessProcess». Quality Manager side, for the Ad-hoc Reporting quality scenario: indicator Report Flexibility Level; analysis model AM_RepFlexLev; decision criteria RDT(R)<60 = OK; derived measure RDT(R); measurement function DST(R)+LCT(R); base measures DST(R) and LCT(R); measurement methods Time DST(R) and Time LCT(R); entity class Report; attribute Structural Complexity. Notes: {RDT,DST,LCT}.unit="seconds", {RDT,DST,LCT}.scale="Natural", {RDT,DST,LCT}.typeOfScale=ratio.]
Fig. 3. i*DWQM model for the ad-hoc reporting quality scenario
In addition, the analysis of the automobile sale price also needs the system to be flexible, where by flexible we refer to the extent to which the data-warehouse software facilitates ad-hoc reporting [15] (see Fig. 3). The quality scenario associated with this softgoal has been defined by the quality manager as follows: "A sales manager is able to design the required report, based on her mental model of the data warehouse, in less than 60 seconds" (referred to as "Ad-hoc Reporting" in Fig. 3). In Fig. 3, this quality scenario has been modelled with the aid of a report flexibility level indicator that evaluates the time it takes the sales manager to design reports ad hoc. This indicator relies on a derived measure called report design time (RDT(R)) that, measured over a given report, returns the number of seconds that it takes the sales manager to actually design the report. The quality scenario establishes that less than 60 seconds is an acceptable time interval. This fact is captured in the decision criteria associated with the indicator. The derived measure is calculated as the sum of two base measures (measurement function), namely the data selection time (DST(R)) and the layout composition time (LCT(R)). These measures are assigned values by applying the corresponding measurement method, which consists in both cases in timing the corresponding tasks over a given report (the entity class). The belief in Fig. 3 indicates that these measures evaluate the structural complexity attribute associated with the report. Last, the unit of measurement, scale, and type of scale concepts are specified as additional notes in Fig. 3.
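The arithmetic behind this quality scenario is small enough to sketch directly. In the Python fragment below, the measurement function RDT(R) = DST(R) + LCT(R) and the decision criteria RDT(R) < 60 come from the model of Fig. 3; the function names, the "NOT OK" label and the sample timings are assumptions introduced only for illustration.

```python
THRESHOLD_SECONDS = 60  # from the decision criteria "RDT(R)<60 = OK"


def report_design_time(dst_seconds: float, lct_seconds: float) -> float:
    """Measurement function: RDT(R) = DST(R) + LCT(R), in seconds."""
    return dst_seconds + lct_seconds


def report_flexibility_level(rdt_seconds: float) -> str:
    """Analysis model: apply the decision criteria to the derived measure."""
    # "NOT OK" is an assumed label; the paper only names the OK outcome.
    return "OK" if rdt_seconds < THRESHOLD_SECONDS else "NOT OK"


# Hypothetical timings for one report: 25 s selecting data, 20 s composing layout.
rdt = report_design_time(dst_seconds=25, lct_seconds=20)
print(rdt, report_flexibility_level(rdt))
```

With these sample timings the derived measure is 45 seconds, which satisfies the criterion; a report that took exactly 60 seconds would not, since the comparison is strict.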
i*DWQM has been implemented as an extension of UML and has been deployed in the Eclipse development platform (Fig. 4). Specifically, UML provides a standard extension formalism: UML profiles. These profiles consist of a set of stereotypes for particular UML modelling elements and some related tag definitions and constraints that, together, permit UML to host our modelling language. The i*DWQM profile is based on two preexisting UML profiles for modelling i* diagrams adapted to the data-warehousing discipline, i.e., the i*DWRA and the i* profile [12] (see Fig. 4). So far, we have presented the mappings that support the definition of the necessary stereotypes for properly representing the i*DWRA modelling elements in UML. While some concepts have been directly mapped to i* elements, others (namely, the Unit Of Measurement, Scale, and Type Of Scale concepts), which in a pure i* framework can be modelled as notes, have been implemented in our profile as tag definitions associated with the measurement-related stereotypes. In addition to the modelling elements, i*DWQM also considers the required constraints (derived from SMO) that assure the right use of these stereotypes, e.g., forcing that only analysis models be the means for achieving an indicator. In this way, we provide a coherent modelling environment for (i) analysing data-warehouse requirements and (ii) associating a quantitative means (through measures) to assess their quality.
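The constraint mentioned above (only analysis models may be the means for achieving an indicator) can be illustrated with a small check over stereotyped elements. This is a plain Python analogue under stated assumptions, not the profile's actual constraint definition, which the paper does not show: the classes `Element` and `MeansEnd`, the function name, and the stereotype name "measurementMethod" are hypothetical, while "indicator", "analysisModel" and "measure" follow the stereotypes listed in Fig. 4.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Element:
    name: str
    stereotype: str  # e.g. "indicator", "analysisModel", "measure"
    tags: Dict[str, str] = field(default_factory=dict)  # tag definitions


@dataclass
class MeansEnd:
    means: Element
    end: Element


def check_indicator_means(links: List[MeansEnd]) -> List[str]:
    """Return one message per means-end link that breaks the constraint."""
    violations = []
    for link in links:
        if link.end.stereotype == "indicator" and link.means.stereotype != "analysisModel":
            violations.append(
                f"{link.means.name} ({link.means.stereotype}) cannot be the "
                f"means for indicator {link.end.name}"
            )
    return violations


indicator = Element("Report Flexibility Level", "indicator")
model = Element("AM_RepFlexLev", "analysisModel")
method = Element("Time DST(R)", "measurementMethod")
# Tag definitions carry the unit, scale and type of scale of a measure.
rdt = Element("RDT(R)", "measure",
              {"unit": "seconds", "scale": "Natural", "typeOfScale": "ratio"})

print(check_indicator_means([MeansEnd(model, indicator)]))
print(check_indicator_means([MeansEnd(method, indicator)]))
```

The first call reports no violations, while the second one flags the measurement method wrongly used as the means for the indicator.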
[Figure 4: three UML profiles («profile» i*, «profile» i*DWRA and «profile» i*DWQM) related by «import» and «use» relationships on the Eclipse «platform». Stereotypes per profile: i*: «goal», «task», «resource», ...; i*DWRA: «strategy», «decision», «fact», «dimension», ...; i*DWQM: «indicator», «measure», «analysisModel», «attribute», ... Stereotypes add semantics, tag definitions, constraints and notation.]
Fig. 4. The implemented UML profiling architecture for modelling with i*DWQM

In this paper, we have presented i*DWQM, a modelling framework to specify measurable quality scenarios that contribute to the assessment of the quality with which data-warehouse requirements are achieved. The completeness and unambiguity of the framework are facilitated by the use of a well-known Software Measurement Ontology [5] for its definition. Moreover, the UML scaffolding on which our approach is based contributes to achieving the desired degree of portability.

The use of our framework complements existing goal-oriented approaches for the development of data warehouses with several additional advantages:
- It increases the weight of quality scenarios and quality managers in the modelling process.
- It adds emphasis to the often forgotten measurable aspect that should always be associated with requirements in order to decrease the risks associated with system development.
- It provides a means to reason about how such measurement should take place, with the final goal of orchestrating and leveraging the different stakeholders' interests during data-warehouse development.
- It provides traceability between the quality scenarios and the particular stakeholders' needs.

Additionally, the measurement domain also benefits from being associated with goal-oriented approaches; in particular, these approaches provide a much richer organisational context for the definition of measures than the one provided by the SMO.

Although this framework has been devised for data warehouses, its characteristics make us believe that i*DWQM can be equally useful for other domains. This hypothesis constitutes one of our future lines of research. Last but not least, measuring models open the path for model-driven data-warehouse development frameworks (see e.g. [13]) to take them into account for the automatic generation of application tests.
This work has been supported by the projects: TIN208-00444, ESPIA (TIN2007-67078) from the Spanish Ministry of Education and Science (MEC), QUASIMODO (PAC08-0157-0668) from the Castilla-La Mancha Ministry of Education and Science (Spain), and DEMETER (GVPRE/2008/063) from the Valencia Government (Spain). Jose-Norberto Mazón and Jesús Pardillo are funded by MEC under FPU grants AP2005-1360 and AP2006-00332, respectively.
References
1. Lionel C. Briand, Sandro Morasca, and Victor R. Basili. An Operational Process for Goal-Driven Definition of Measures. IEEE Trans. Software Eng., 28(12):1106–1125, 2002.
2. Samira Si-Said Cherfi and Nicolas Prat. Multidimensional Schemas Quality: Assessing and Balancing Analyzability and Simplicity. In ER (Workshops), pages 140–151, 2003.
3. Xavier Franch. On the Quantitative Analysis of Agent-Oriented Models. In CAiSE, pages 495–509, 2006.
4. Xavier Franch, Gemma Grau, and Carme Quer. A Framework for the Definition of Metrics for Actor-Dependency Models. In RE, pages 348–349, 2004.
5. Félix García, Manuel F. Bertoa, Coral Calero, Antonio Vallecillo, Francisco Ruiz, Mario Piattini, and Marcela Genero. Towards a consistent terminology for software measurement. Inform. Software Tech., 48(8):631–644, 2006.
6. Paolo Giorgini, Stefano Rizzi, and Maddalena Garzetti. Goal-oriented requirement analysis for data warehouse design. In DOLAP, pages 47–56, 2005.
7. Matteo Golfarelli, Dario Maio, and Stefano Rizzi. The Dimensional Fact Model: A Conceptual Model for Data Warehouses. Int. J. Coop. Inf. Syst., 7(2-3):215–247, 1998.
8. Bodo Hüsemann, Jens Lechtenbörger, and Gottfried Vossen. Conceptual data warehouse modeling. In DMDW, page 6, 2000.
9. William H. Inmon. Building the Data Warehouse. Wiley, 2005.
10. Ralph Kimball and Margy Ross. The Data Warehouse Toolkit. Wiley, 2002.
11. Craig Larman. Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design and Iterative Development. Prentice Hall, 2004.
12. Jose-Norberto Mazón, Jesús Pardillo, and Juan Trujillo. A Model-Driven Goal-Oriented Requirement Engineering Approach for Data Warehouses. In ER Workshops, pages 255–264, 2007.
13. Jose-Norberto Mazón and Juan Trujillo. An MDA approach for the development of data warehouses. Decis. Support Syst., 45(1):41–58, 2008.
14. Object Management Group. Unified Modeling Language (UML), version 2.1.1. February 2007.
15. Fábio Rilston Silva Paim and Jaelson Castro. Enhancing Data Warehouse Design with the NFR Framework. In WER, pages 40–57, 2002.
16. Manuel A. Serrano, Coral Calero, Houari A. Sahraoui, and Mario Piattini. Empirical studies to assess the understandability of data warehouse schemas using structural metrics. Software Qual. J., 16(1):79–106, 2008.
17. Panos Vassiliadis. Data Warehouse Modeling and Quality Issues. PhD thesis, National Technical University of Athens, 2000.
18. Eric S. K. Yu. Towards Modeling and Reasoning Support for Early-Phase Requirements Engineering. In RE, pages 226–235, 1997.

A Software Measurement Ontology Terms Definition