"How much?" Is Not Enough - An Analysis of Open Budget Initiatives
Alan Freihof Tygel, Judie Attard, Fabrizio Orlandi, Maria Luiza Machado Campos, Sören Auer

Alan Tygel: Graduate Program on Informatics – PPGI – UFRJ, Brazil
Judie Attard: University of Bonn, Germany
Fabrizio Orlandi: University of Bonn, Germany
Maria Luiza Machado Campos: Graduate Program on Informatics – PPGI – UFRJ, Brazil
Sören Auer: University of Bonn and Fraunhofer IAIS, Germany
ABSTRACT
A worldwide movement towards the publication of Open Government Data is taking place, and budget data is one of the key elements pushing this trend. Its importance is mostly related to transparency, but publishing budget data, combined with other actions, can also improve democratic participation, allow comparative analysis of governments and boost data-driven business. However, the lack of standards and common evaluation criteria still hinders the development of appropriate tools and the materialization of the expected benefits. In this paper, we present a model to analyse government initiatives to publish budget data. We identify the main features of these initiatives with a double objective: (i) to drive a structured analysis, relating some dimensions to their possible impacts, and (ii) to derive characterization attributes to compare initiatives based on each dimension. We define use perspectives and analyse some initiatives using this model. We conclude that, in order to favour use perspectives, special attention must be given to user feedback, semantics standards and linking possibilities.
General Terms: open government data, open budget initiatives, e-government, participation, transparency
1. INTRODUCTION
In the last six years, a worldwide movement towards the publication of Open Government Data (OGD) has been taking place. The aims and scope of OGD initiatives in each country are diverse, and we can count almost 100 countries publishing some kind of OGD (according to the Open Data Index: http://index.okfn.org/). The motivations for governments to publish OGD are also diverse. They range from the democratic point of view, with increasing government transparency and citizen participation, to the more economic motivation of fostering new data-driven businesses. The strengthening of law enforcement has also fostered OGD publishing [10].

(DOI: http://dx.doi.org/10.1145/2506182.2506197)

A large number of stakeholders may take part in the OGD ecosystem, namely: as data providers, public administrations at different levels (local, regional, national and transnational) and citizens; and, as consumers, civil society initiatives and NGOs, companies, journalists and media organisations. While data providers mostly play the specific role of publishing data in an open format, other stakeholders participate in a number of ways, including viewing the open data, sharing feedback, and exporting data into their own systems.
It is also expected that these stakeholders behave as prosumers, not only passively consuming data, but also interfering in its production and publication.

OGD can be related to a diversity of themes. Education, crime, health, transportation and company registration are common subjects. However, one type of data is of particular importance: government budgetary data, as timely access to these data is critical to accomplish government accountability.

All governments and public administrations maintain budgetary data, unlike, for example, bus position data, which depends on sensors, or data about the occurrence of a specific disease, which depends on a health information system. From the citizen side, information on budget is a key element to ensure that public funds are being properly used. In locations where a participatory budget has been implemented, that is, where part of the budget allocation is decided by the community, access to this kind of data is indispensable. A global initiative to improve openness of governments, the Open Government Partnership (OGP), has fiscal transparency as a minimum eligibility criterion (other criteria can be found at http://bit.ly/1929F1l), characterizing budget data as a foundation of open government.

Even with so many possible positive impacts, existing public financial transparency portals suffer from a number of shortcomings. First of all, they suffer from the large number of diverse data structures that make the comparison and aggregate analysis of transnational financial flows practically impossible. The tools to present, search, download and visualise this financial data are also nearly as diverse as the number of existing portals. This heterogeneity [24] may even prevent an analysis of the quality of the data for the same funds administered by different funding authorities.
Past efforts have sought to overcome this situation by creating comprehensive and connected transparency portals, such as Farmsubsidy.org, and more recently, Publicspending.net.

Within the existing open budget initiatives, low user engagement has been reported [28]. Moreover, most of the budget publishing efforts result in simple data catalogues, fragmented and dispersed, because they do not share standards and methodologies [24]. The absence of standards can lead to data misuse [30], or even to results opposed to the initial aims [8].

The basis for such standards has to be set. Together with other ongoing initiatives [15, 26], we believe that the development of a solid standard can help governments to make their budget data more usable, and thus enable citizen participation in the democratic process. In this article we define a structured analysis framework for budget data, which can help developers and policy makers to understand the importance of various aspects of budget data publishing and to develop more adequate budget publishing systems. After defining some foundational concepts (Section 2), highlighting the importance of budget data (Section 3) and discussing related work (Section 4), we describe the chosen methodology (Section 5) and derive dimensions and characterization attributes, based on three use perspectives (Section 6). These characterization attributes are applied to 23 open budget initiatives (Section 7), and results are discussed (Section 8).
2. WHAT IS BUDGET DATA?
Open Budget Data is the topic of a few recent publications [26, 23, 14, 16, 4]. Nevertheless, it is important to establish a common ground for some basic concepts, as they do not always have a single widely accepted definition. Here, we propose definitions for Budget, Spending and Revenue, as the main quantities tackled, and Open Budget Data as the general term (a further discussion about these terms can be found at http://community.openspending.org/research/handbook/types-of-spending-data/).

DEFINITION 1. Budget is the description of the amount of money planned to be spent in a specified time period. Budget descriptions can refer to several levels of specificity, from general (total amount to be spent) to specific (amount by area, or category). A budget description can be characterized by:
• (i) the scope, that is, the corresponding administrative level (municipality, region, country etc.);
• (ii) optionally, a domain, such as healthcare or public transportation;
• (iii) if applicable, the related location (region, city, neighbourhood, or latitude and longitude); and
• (iv) a period of time.
A budget comprises a set of budget items, each of which has a budget category and an associated amount with a currency. Categories can be organized hierarchically, where higher levels of the hierarchy represent aggregations of the lower levels. There are different types of budget, such as proposed, planned, and certified, the last of which is presented after the budget term. Budgets may also receive amendments during their associated term.

DEFINITION 2. Spending, or expenditure, refers to the amount of money actually spent by the public administration. It can also be seen as the realisation of the budget. Government spending can be split into four main categories:
• (i) Transfer payments, related to social benefits such as pensions, housing, or floor income for low-income households;
• (ii) Current government spending, related to the costs of maintaining the government structure, mainly public employees' salaries;
• (iii) Capital spending, which goes towards building infrastructure, such as roads, hospitals, schools etc.; and
• (iv) Financial costs, such as internal and external debt services.
Ideally, spending should be published at the finest grain: transactions, which are descriptions of every payment, including value, time period and recipient. Transactions should also be classified according to properly defined criteria to generate aggregate amounts. These criteria are the same as specified for the budget: scope, domain, place and time. There also exist different types of spending, such as planned (according to the budget), authorized (payment order) and executed (money transferred from government to the recipient).

DEFINITION 3. Revenue is the amount of money received by a government administration. Revenues can have several types of origins, such as taxes (revenue, commercialization), service fees (transportation), royalties (oil and mine exploration), concessions (roads, electromagnetic spectrum) or financial operations. Predicted revenues, used to specify the budget, may differ from the actual revenues.

DEFINITION 4. Open Budget Initiative refers to any portal or application which publishes budget, spending and/or revenue data, and which allows civil society, IT experts or not, to access those data. It may comprise one or many datasets, which can be downloaded in several formats or directly visualized in tables, charts or maps. The model presented in Section 6 describes an Open Budget Initiative in further detail.
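The Budget definition above can be read as a small data structure. The following is a minimal sketch under stated assumptions, not a proposed standard: the class names `Budget` and `BudgetItem`, the field names, and the representation of a category as a path of labels are all hypothetical choices made for illustration. It only shows how hierarchical categories allow higher levels to be computed as aggregations of lower ones.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Optional


@dataclass
class BudgetItem:
    """One budget line: a category path, an amount and a currency."""
    category: tuple[str, ...]  # hierarchical path, e.g. ("Health", "Hospitals")
    amount: Decimal
    currency: str


@dataclass
class Budget:
    """A budget description with scope, optional domain/location and period."""
    scope: str                      # administrative level, e.g. "municipality"
    period: str                     # e.g. "2014"
    budget_type: str = "planned"    # proposed, planned or certified
    domain: Optional[str] = None    # optional, e.g. "healthcare"
    location: Optional[str] = None  # only if applicable
    items: list[BudgetItem] = field(default_factory=list)

    def total(self, prefix: tuple[str, ...] = ()) -> Decimal:
        """Sum all items whose category path starts with `prefix`.

        Assumes every item uses the same currency (no conversion is done).
        """
        return sum(
            (item.amount for item in self.items
             if item.category[:len(prefix)] == prefix),
            Decimal("0"),
        )
```

With such a structure, `budget.total(("Health",))` would return the aggregate of every item below the hypothetical "Health" category, while `budget.total()` gives the overall amount.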
3. WHY BUDGET DATA?
The importance of publishing government budgetary data can be summarised in five key elements:
Transparency: Opening budget data unveils public funds management. This increases accountability and therefore augments citizens' trust in public administration, whilst having the potential of uncovering hidden transactions and thus preventing corruption. An important factor which can stimulate corruption is the fact that funding goes through the hands of public officials without further scrutiny. In European Union Member States, this is particularly evident within public procurement, which is prone to corruption owing to deficient control mechanisms [6]. Essentially, such acts are concealed from the public eye. Supporting financial transparency enhances accountability within public sectors and, as a result, helps to prevent corruption.
Participation: Opaque regimes may compel citizens to engage against the government. A transparent public administration, on the contrary, can stimulate social participation in community enhancement. Open budget initiatives can not only enable meaningful civil society scrutiny of transnational financial flows, but they can also provide platforms for stakeholders to develop benchmarks that create pressure on public authorities to provide data in a timely, comparable, re-usable and well-structured fashion. These platforms can also involve local citizens in the budget planning and auditing phases, by allowing them to interact with the process, providing opinions and suggestions on setting budget priorities, and providing feedback on the published transactions. A virtuous circle can be created, in which both public officials and civil society realise the value of data and analysis tools, in a collaborative environment open to contributions and engagement.
Comparative Analysis: Well-organized budget data help researchers and policy makers to compare spending strategies between cities, states and countries, and also among different administration levels. Visualisation, analytics and exploration tools can offer different stakeholders an opportunity to scrutinize and interpret financial data related to a region of interest. They also make it possible to compare allocations and transactions between multiple regions, to visualise detected trends and budget projections, and to investigate anomalies and activities which have been flagged as suspicious. A necessary condition for that is the compatibility and consistency of data from different data sources.
Efficiency and Effectiveness: Efficiency of public spending can be assessed by comparing, for example, the cost per kilometre of a railway. The effectiveness can also be assessed, in this case, by the revenues generated with the railway.
Business Value: It has been recently stated that "Open data can help unlock U$3 trillion to U$5 trillion in economic value annually" [13]. Publishing budget data can stimulate the creation, delivery and use of new services on a variety of devices, utilising new web technologies, coupled with open public data. These services include visualisation services and data discovery services, such as data mining and comparative analysis, which enable stakeholders to explore the data, identify patterns, and potentially forecast budget and transaction trends. Budget data can also generate value by empowering journalists when they report on spending items. Accurate information on public funds usage may enable content producers to create better articles.
4. RELATED WORKS
A number of recent works proposed frameworks, impact measures or comparison criteria for the general open data domain. Some of them aim at the comparison of e-government and open data policies [29, 25]. In [22], a framework is proposed to evaluate OGD initiatives, pointing also to the development of impact metrics.

A theoretical background to analyse the impact of OGD was developed in [7]. Impacts are divided into economic, political and social, and for each of them, possible implementation issues and impact metrics are discussed in depth. Recently, a working group was created to develop methods for assessing open data. In their first report [3], a draft of a framework is proposed.

Automatic benchmarking techniques are proposed in [1]. Despite enabling large-scale, low-cost and high-frequency evaluations, automatic assessment can miss some political and social aspects of open data.

Even though structured analysis and comparison of open budget initiatives have not received much attention in the literature, two works must be highlighted. The Open Budget Survey [11] is a research project that, every two years, "measures the state of budget transparency, participation, and oversight in countries around the world". It generates the Open Budget Index, which is updated monthly, and is based on the publication of eight key budget documents. Despite being a very useful comparison tool, this methodology does not evaluate the information systems used to publish budget data, which are the means by which the information reaches society.

An evaluation and comparison of almost 30 Brazilian government transparency portals, on several administration levels, is presented in [2]. The analysis was based on the 8 Open Government Principles, evaluated for each portal by experts. Despite being a well-defined and widely accepted model, these principles are quite general, and do not refer to specific characteristics of budget data. Moreover, they cover basically the publisher side.
5. METHODOLOGY
The research approach used to develop this model was inspired by the observation, induction and deduction method used in [29]. After analysing the related bibliography and observing some randomly collected open budget initiatives, we used inductive reasoning to build a first version of the model. The model is a set of dimensions, which represent different themes to be assessed in an open budget initiative. Dimensions are grouped into parts, according to their general functions.

The same basis is used to define use perspectives (UP), which represent different ways of using budget data. From the UPs, we extract related requirements.

Model and use perspectives were then applied to other open budget initiatives through deductive reasoning, in order to verify the fitness of the dimensions and the coverage of the use perspectives. Missing items were added to the model and to the use perspectives, and the feedback loop was run until no significant changes were found. Finally, use perspectives were checked against the model, in order to verify the correspondence between model dimensions and use perspectives. This correspondence is materialized in the characterization attributes.

The result of this observation, induction and deduction approach is described in the next section.
6. A MODEL TO ANALYSE OPEN GOVERNMENT BUDGET DATA PORTALS
The main objective for building this model is the need for a mechanism to assess different strategies for publishing budget data. We do not aim to build rankings, but rather to systematize open budget initiatives in order to assess their fitness to specific use perspectives. A general overview of the proposed model is depicted in Figure 1. The model consists of four parts:

1. Context, referring to external aspects related to the initiative;
2. General Aspects, referring to the overall characterization of the initiative;
3. Data Publishing, referring to aspects specific to the data publishing process; and
4. Data Consumption, referring to aspects specific to the data consumption process.

Naturally, there is a strong coupling between these parts. The way data are published directly affects consumption. By the same reasoning, the feedback generated by users (should) affect data publishing. The context particularly impacts the general aspects, but also influences the other parts.

[Figure 1 diagram. Parts and dimensions shown: Context; General Aspects (Objective, Content, Responsibility); Publishing (Data, Formats, Metadata, Semantics, Access, Licence); Consumption (Usability/Design, Feedback).]

Figure 1: Model to analyse open budget initiatives. The four parts – General Aspects, Publishing, Consumption and Context – are interconnected, and composed of several dimensions. Icons made by Flaticon (CC).

(Footnote: http://opengovdata.org/)

The context part represents the environment in which the open budget initiative is involved. It stands for the open data policies and legislation which rule the publication of spending data, and also the government initiatives to promote the use of data, either only by advertising, or more incisively by promoting data literacy. Although we recognize that the context is a key element for the success of an open budget initiative, we will not consider it in the scope of this paper because its complexity would make a first approach to an objective model unfeasible. For the time being, we will focus on the general aspects directly related to the initiative, and on the issues related to publishing and consuming data.

Each part of the model is composed of several dimensions, which will be assessed through
Characterization Attributes:

DEFINITION 5. Characterization Attributes are features of open budget initiatives that: (i) are objectively assessable; (ii) expect qualitative values; and (iii) have direct impact on the realisation of use perspectives.
The characterization attributes derived from the dimensions are summarised in Table 1.

Characterizing an open budget initiative is the first step in order to be able to assess quality. The term quality may refer to different concepts. In this work, we define quality as the conformance to requirements, which in our case are those associated with the use perspectives. In other words, we can say that quality is the fitness for use. Thus, we define three use perspectives, from which we extract some requirements:
UP1 – Transparency: Journalists, software developers, NGOs, and grass-roots movements use budget data to audit government and to translate data into more accessible formats for society. For this use case, detailed data (i.e., transaction level), consistent classification levels, and machine-readable formats are some important requirements. Discussion and feedback on the provided data are also requirements in this case, for example, for suggesting different priorities for budgeting, or discussing a particular transaction. Both citizens and public administration benefit from this feature, since citizens (or other stakeholders) can show their perspectives and the public administration entity can check whether the current priorities need to be amended.
UP2 – Participation: For the last two decades, cities from all over the world have been implementing participatory budgeting (PB) experiences with different systems and procedures. Research shows that developing and promoting PB digital solutions can increase civic engagement up to sevenfold [21]. In Europe, digital solutions to promote citizen engagement in budget creation include, for example, sending proposals by email, participating in online forums and discussions, subscription to SMS updates and video streaming [17]. Germany presents one of the most advanced digital solutions to engage citizens, as shown in the participatory budgeting portal of the city of Freiburg. The participation use perspective will be exemplified by the PB case. PB members must have access to accurate and easily understandable budget data. From this perspective, design, usability, and human-readable formats are the most important requirements. Hierarchically aggregated categories also play an important role.

UP3 – Policy Making:
If adequately published, budget data can be used to compare the way each government manages public funds. Researchers and policy makers should be able to compare the budgets and spending data between (i) different public administrations (e.g. Cologne vs Munich); or (ii) different periods (e.g. year 2013 vs year 2014), and thus relate spending strategies to political, economic and social outcomes. Comparing spending profiles among governments requires the use of common classifications, vocabularies and ontologies, and the possibility of linking data with other databases, for example, data on multinational enterprises [24]. In order to enable the integration of the corresponding budget data across the different public administration contexts, a semantic data model for budgets and spending has to be defined. In this case, publishing financial data in a reusable, machine-processable, linked-data format can enable integration and reuse across multiple sources. The use of a standard format also facilitates the comparison of data from different municipalities or regions. More importantly, it allows all the stakeholders involved or interested in budget planning or spending to manipulate data using the same tools and methods, thus supporting financial transparency in public budgeting and spending. This may allow the creation of visualisations and comparative data analyses for the discovery of trends. Stakeholders will therefore be able to view and compare allocated budgets and transactions, and give feedback on each item. This feedback can then be shared through social media and also be directly exploited by governments and public administrations to achieve better budget management.
The latter two stakeholders will thus benefit from receiving targeted suggestions, comparative benchmarks and scenarios.

In the remainder of this section, we explain each part of the model, by defining its dimensions, explaining their importance, and proposing characterization attributes (summarised in Table 1), in order to assess the fitness to each use perspective. We define a user as any of the stakeholders aiming to consume data from an open budget initiative.

Motivations to publish budget data, or open data in general, can be very diverse. In Section 3, we listed five common reasons for publishing budget data: transparency, participation, comparative analysis, efficiency and effectiveness assessment, and generating business value. Defining the intended audience is also important, since different user profiles require different approaches. For example, in UP1, detailed data in machine-readable formats are desirable, while for UP2, human-readable charts and tables are most suitable. A SPARQL endpoint could better fit the needs of UP3.

DEFINITION 6. The Objective dimension represents the motivations stated for publishing budget data, including the definition of the intended audience.

Characterization attributes: We define as characterization attributes: (i) whether an initiative clearly states its objective (CA1), and (ii) whether the intended audience is explicitly defined (CA2).
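The notion of quality as fitness for use can be illustrated by treating the requirements of each use perspective as predicates over an initiative's characterization attributes. The sketch below is only one possible reading: the attribute identifiers follow Table 1, but the mapping from each use perspective to specific attributes and thresholds (for example, treating two or more stars as "machine readable") is an assumption made here for illustration, and the `portal` record is hypothetical.

```python
from typing import Any, Callable

# Characterization attributes of one initiative, keyed by CA identifier.
Initiative = dict[str, Any]
Requirement = tuple[str, Callable[[Initiative], bool]]

# Assumed mapping from use perspectives to attribute-level requirements.
REQUIREMENTS: dict[str, list[Requirement]] = {
    "UP1": [
        ("transaction-level data", lambda i: i.get("CA9") == "Transaction"),
        ("machine-readable formats", lambda i: i.get("CA10", 0) >= 2),
        ("feedback channel", lambda i: bool(i.get("CA16"))),
    ],
    "UP2": [
        ("human-readable presentation",
         lambda i: bool({"Stories", "Infographics"} & set(i.get("CA13", ())))),
        ("category dimension", lambda i: "Category" in i.get("CA8", ())),
    ],
    "UP3": [
        ("shared vocabulary or ontology", lambda i: i.get("CA12") == "Yes"),
        ("linked-data formats", lambda i: i.get("CA10", 0) >= 4),
    ],
}


def unmet_requirements(initiative: Initiative, perspective: str) -> list[str]:
    """Return the labels of the requirements the initiative does not meet."""
    return [label for label, holds in REQUIREMENTS[perspective]
            if not holds(initiative)]


# Hypothetical initiative: aggregate CSV data with infographics and comments.
portal: Initiative = {
    "CA8": {"Time", "Category"},
    "CA9": "Aggregate",
    "CA10": 3,
    "CA12": "No",
    "CA13": {"Raw Data", "Infographics"},
    "CA16": {"Comments"},
}

for up in ("UP1", "UP2", "UP3"):
    print(up, unmet_requirements(portal, up))
```

The function reports what is missing rather than producing a score, in keeping with the stated aim of characterizing initiatives instead of ranking them.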
Open budget initiatives are very heterogeneous regarding the presented content. Data can refer to several administration levels (local, regional, national), and also to the different branches of power (Executive, Legislative or Judiciary), according to the political system of each country.

DEFINITION 7. The Content dimension has the objective of assessing the nature of the information contained in an open budget initiative.

Characterization attributes: The first important distinction we want to highlight is whether the initiative is exclusively for publishing budget data, or whether it contains other kinds of information (CA3). Then, we also distinguish primary sources of data from applications working over data published by other initiatives, that is, secondary data (CA4). Finally, we assess the scope of the initiative (CA5), classifying it into local (1), regional (2), national (3) or transnational (4) range. A special sign identifies initiatives focused only on the legislative power (L), considering that initiatives normally exhibit general budget data. We also consider that the scope can be generic (5), when the initiative allows publishers to display different datasets, referring to different scopes.
Publishing budgetary data implies a great responsibility for the people in charge of the initiative. This kind of information is quite sensitive, and mistakes can lead to severe consequences. Government, as supplier of primary data, may define specific sectors to be responsible for publishing budget data. In the US, responsibility lies with the General Services Administration, while in the UK, there is a Transparency and Open Data team under the Cabinet Office. In Brazil, administration is under the Ministry of Planning, Budget and Management. Civil society organizations also play an important role by building applications over primary data, especially regarding UP2. In this case, responsibility lies in making the context clear, and simplifying as much as possible for data to be understood, but as little as possible to avoid misinterpretations.

DEFINITION 8. The Responsibility dimension of an open budget initiative refers to the person(s) or organization(s) responsible for publishing the data, from operational tasks up to guaranteeing the authenticity of the provided information.

Characterization attributes: We define, as a characterization attribute, the distinction between data provided by governments and by society (CA6). We also consider the possibility of a joint government/society partnership.
Actions to be taken regarding these dimensions are expected from data publishers, supposedly influenced by data consumers.

While the Content dimension (6.1.2) dealt with general aspects related to the content of an open budget initiative, the Data dimension focuses on specific aspects.

DEFINITION 9. The Data dimension represents specific aspects of the data content and determines what kind of information can be extracted from an open budget initiative.

Characterization attributes: In order to characterize the data content, we define three characterization attributes: (i) Measures, i.e., the types of represented quantities, which can be budget, spending and/or revenues (CA7); (ii) Dimensions, i.e., how the measures are qualified, which can be time, space and/or other categories (CA8); and (iii) Granularity, i.e., the finest level of detail available: transaction or aggregate (CA9). For all CAs in this dimension, we also accept the value generic, when the options are not predefined and several datasets in the same initiative present different settings.
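The relation between measures, dimensions and granularity can be made concrete with a small example. The sketch below assumes a hypothetical list of transaction records, each carrying one measure (`amount`) qualified by a few dimensions; the field names, places and values are invented for illustration. It shows why transaction-level granularity is the more general case: aggregates over any subset of dimensions can be derived from transactions, whereas the reverse is not possible.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical transaction-level spending data (values are made up).
transactions = [
    {"year": 2014, "place": "City A", "category": "Health", "amount": Decimal("120.50")},
    {"year": 2014, "place": "City A", "category": "Health", "amount": Decimal("79.50")},
    {"year": 2014, "place": "City A", "category": "Education", "amount": Decimal("300.00")},
    {"year": 2013, "place": "City B", "category": "Health", "amount": Decimal("50.00")},
]


def aggregate(rows, dimensions):
    """Sum the `amount` measure for each combination of the given dimensions."""
    totals = defaultdict(Decimal)  # Decimal() is zero, so missing keys start at 0
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row["amount"]
    return dict(totals)


by_category = aggregate(transactions, ["category"])
by_year_and_place = aggregate(transactions, ["year", "place"])
```

An initiative that only publishes the content of `by_category` would be characterized as Aggregate for CA9, and the individual payments could no longer be recovered from it.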
When data are offered for download, the format in which they are encoded plays a very important role. For UP1, data in machine-readable formats are crucial. For UP3, unique identification of entities and relations is also very important. The semantic resources generated by open budget initiatives can be instantly ready for reuse when they follow Linked Open Data (LOD) principles and guidelines [9]. In this case, all URIs must be resolvable and dereferenceable. These resources shall be accessible via a SPARQL endpoint (SPARQL is a set of specifications to query and manipulate RDF graph content on the Web), or by directly resolving resource URIs. The latter must return either an RDF representation of the resource, or a more eye-friendly HTML visualisation, according to the negotiated content type. It is of utmost importance that the resources are available on a stable server, as these resources could be linked to others in the LOD cloud. The resulting data, which will be in a standard interoperable format (RDF), will be fully compliant with the statement on best practices given by the G8 Science Ministers [19]: "Data should be easily discoverable, accessible, assessable, intelligible, useable, and wherever possible interoperable to specific quality standards". Since LOD is a widespread initiative, existing tools can be exploited in order to reuse datasets.

Table 1: Model parts, dimensions and characterization attributes defined to characterize an Open Budget Initiative.

Model Part  | Dimension      | Characterization Attribute                      | Possible Values
------------+----------------+-------------------------------------------------+--------------------------------------------------------------
General     | Objective      | CA1: Is the objective clearly stated?           | Yes/No
            |                | CA2: Is the intended audience defined?          | Yes/No
            | Content        | CA3: Are data exclusively on budget?            | Yes/No
            |                | CA4: What is the source of data?                | Primary Source/Secondary Source
            |                | CA5: What is the scope covered by the strategy? | Country/Regional/Local/Transnational/Generic, and Legislative
            | Responsibility | CA6: Who is responsible for the strategy?       | Government/Society/Both
Publishing  | Data           | CA7: What measures are available?               | Budget/Spending/Revenues/Generic
            |                | CA8: What dimensions are available?             | Time/Place/Payer/Payee/Category/Generic
            |                | CA9: What is the finest data granularity?       | Transaction/Aggregate/Generic
            | Formats        | CA10: Which formats are available?              | Five Stars of Open Data
            | Metadata       | CA11: Are metadata available?                   | Yes/No
            | Semantics      | CA12: Is any ontology or vocabulary used?       | Yes/No
            | Access         | CA13: How are data made available?              | Catalogue/Raw Data/Querying System/Stories/Infographics
            | License        | CA14: Are data licensed?                        | Yes/No
Consumption | Usability      | CA15: What software tool is used?               | CKAN/OpenSpending/Other
            | Feedback       | CA16: Is it possible to give feedback over data? | Comments/Data Request/Issue Reporting

DEFINITION 10. The Formats dimension represents the types of formats in which downloadable data are offered by an open budget initiative.
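The requirement that a resource URI return either RDF or HTML "according to the negotiated content type" refers to HTTP content negotiation. The following sketch shows the server-side decision only, as a pure function over the value of the `Accept` header. It is deliberately simplified: it ranks entries by q-value alone and ignores the finer precedence rules of the HTTP specification, the list of supported media types is an assumption, and falling back to HTML when nothing matches is a design choice made here (a server could equally answer with 406 Not Acceptable).

```python
def parse_accept(header: str) -> list[tuple[str, float]]:
    """Parse an HTTP Accept header into (media type, q) pairs, best first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep the client's order.
    return sorted(entries, key=lambda e: e[1], reverse=True)


# Representations this hypothetical server can produce for a resource.
SUPPORTED = ("text/html", "text/turtle", "application/rdf+xml")


def negotiate(accept: str, supported=SUPPORTED, default="text/html") -> str:
    """Pick the representation to serve for a given Accept header."""
    for media_type, q in parse_accept(accept):
        if q <= 0:
            continue
        if media_type == "*/*":
            return default
        if media_type in supported:
            return media_type
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # "text/*" becomes "text/"
            for candidate in supported:
                if candidate.startswith(prefix):
                    return candidate
    return default
```

Under these assumptions, a browser sending `text/html,application/xhtml+xml;q=0.9,*/*;q=0.8` would receive the HTML view, while a linked data client sending `text/turtle` would receive the RDF serialization of the same resource.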
Characterization attributes: Here, we adopt the well-established five stars model of open data as the characterization attribute (CA10); more on this model at http://5stardata.info/.

Adequate metadata are fundamental for providing complementary information about the context in which data are immersed. Information such as dataset author, publication date and last update, formats and license usually constitutes the basic metadata. Another useful class of metadata is provenance. Provenance metadata describe the transformations applied to the dataset, and can also explain the process through which each data item was generated.

DEFINITION. The Metadata dimension refers to the availability of descriptors associated with the provided datasets.

Characterization attributes: As a characterization attribute, we check for the existence of metadata in an open budget initiative (CA11).
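As an illustration of how CA10 and CA11 could be recorded for a single dataset, the following is a minimal sketch. The mapping from file formats to star levels, the metadata field names and the function names are our assumptions for the example; they are not prescribed by the model. The sketch also ignores the open-licence precondition of the five stars model, which is assessed separately under CA14.

```python
# Illustrative only: the format-to-star mapping and the metadata field names
# below are assumptions made for this sketch, not part of the model.

# Five stars of open data: 1 = any format, 2 = structured data,
# 3 = non-proprietary format, 4 = RDF with URIs, 5 = RDF linked to other data.
FORMAT_STARS = {"pdf": 1, "xls": 2, "csv": 3, "json": 3, "xml": 3, "rdf": 4}

BASIC_METADATA = ("author", "published", "last_update", "formats", "license")


def star_rating(formats, links_to_other_data=False):
    """CA10: highest star level reached by any of the offered formats."""
    # Unknown formats are treated as one-star (an assumption of this sketch).
    stars = max((FORMAT_STARS.get(f.lower(), 1) for f in formats), default=0)
    if stars == 4 and links_to_other_data:
        stars = 5
    return stars


def has_metadata(record):
    """CA11: True if at least one basic metadata field is filled in."""
    return any(record.get(field) for field in BASIC_METADATA)


# Hypothetical dataset description, invented for the example.
dataset = {"author": "Ministry of Finance", "license": "ODC-By"}
print(star_rating(["PDF", "CSV"]), has_metadata(dataset))
```

Taking the maximum over the offered formats reflects the idea that an initiative is rated by the most open format it provides; other aggregation choices are possible.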
In order to be correctly interpreted, data must be contextualized to avoid problems that emerge from terminology ambiguity or lack of agreement. Without post-hoc unification the data may be difficult to understand, as their users may need to familiarize themselves with different terminologies for each dataset. Having a single data format may solve structural heterogeneity, at the cost of introducing yet another format bridging the others. A more complex issue is semantic heterogeneity, which may be addressed by solutions ranging from simple ones based on vocabularies to more comprehensive approaches based on ontologies. It is therefore fundamental not to multiply the competing approaches for modelling public budgets and spending data, but rather to build on previous work, such as [15], and align divergent approaches using links and semantic relations.

The current repositories of public finance data, such as OpenSpending.org, serve well as data catalogues, in which each dataset exists more or less in isolation as a separate black box. The absence of links and explicit semantics forms a barrier to automated processing, combining, and joining datasets of distinct origin. In the context of such tasks, applying linked data and semantic web technologies offers greater data interoperability.

Nevertheless, perhaps the most important benefits of linked data are those for improving data interpretation. The key to such improvement comes from the recognition that measures in public budget and spending data are relative. If there is no way to compare them and put them into context, it is difficult to make sense of the data. Putting money into the wider context in which it was spent helps to perform meaningful analyses and find comprehensible "stories" in data. The context may be provided by linked datasets, such as population statistics. Added links to external data can connect public finance with the LOD cloud, offering many ways to view data given different contextualising information, such as economic indicators or demographic statistics. Ultimately, a key goal of the proposed data model is to enable better comprehension of public finance data.

For UP3, following semantic standards is mandatory. Even though budget data tend to be very heterogeneous, especially between different countries, some common points can be found, for example spending categories (Health, Education, Debt Services) or international companies. Budget ontologies for specific countries have been developed [20, 18], and an international effort is also under way [15]. Although it does not provide immediate linking possibilities, following standards such as the Special Data Dissemination Standard [12] helps to make data comparable.

DEFINITION. The Semantics dimension refers to the support of any complementary terminological resource that allows a better understanding of the data domain concepts.

Characterization attributes: We define the Semantics characterization attribute as a boolean value that indicates the presence of standardized vocabularies or ontologies (CA12) in the open budget initiative.
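The remark that budget measures are relative can be illustrated with a small sketch: the same spending figures rank two regions differently once they are joined with population statistics. All region names and numbers below are invented for the example, and the shared dictionary key merely stands in for the common identifiers that linked data would provide.

```python
# Hypothetical figures for two invented regions; not data from any initiative.
spending = {"region-a": 1_200_000.0, "region-b": 300_000.0}  # e.g. health spending
population = {"region-a": 600_000, "region-b": 100_000}      # linked statistics


def per_capita(spending, population):
    """Join the two datasets on the shared region key and normalise."""
    return {
        region: amount / population[region]
        for region, amount in spending.items()
        if population.get(region)  # skip regions with missing or zero population
    }


result = per_capita(spending, population)
# region-a spends more in absolute terms, region-b more per inhabitant.
print(result)
```

Without a common key (or, in the linked data setting, a common URI) for the regions, the join in `per_capita` would not be possible, which is the practical cost of missing links and shared vocabularies.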
The simplest way of publishing budget information is by offering data for download, which can be done in several formats. However, in UP2, interactive charts, maps or infographics are more useful than downloadable datasets, even if this might not be considered open data in the strict sense. Thus, the Access dimension aims to check the adequacy between the desired audience and the way data are offered.

DEFINITION. The Access dimension refers to how the initiative presents budget data to its audience.

Characterization attributes: Data Access is a characterization attribute (CA13) which can be assigned as:
• Downloadable data, Linked Data/SPARQL endpoint;
• Data and metadata catalogue;
• Exploration by Tables;
• Visualization by Charts, Maps, Comparison; and/or
• Stories
Licensing is a fundamental issue for data reuse. In UP1, some kinds of use can be hindered by the absence of adequate licensing, for example, the development of derived applications. Currently, three types of general licenses for open data are available (see more at http://opendatacommons.org/licenses/): Public Domain Dedication and License (PDDL), Attribution License (ODC-By), and Open Database License (ODC-ODbL). Some governments have developed their own open data licenses, for example Germany and the UK.

DEFINITION. The Licensing dimension assesses the legal status of data available in an open budget initiative.

Characterization attributes: We define a boolean characterization attribute to describe the existence of a license (CA14) on data published by an open budget initiative.

The justifications and characterization attributes identified in this paper aim at the success of the use perspectives. In this part, we detail specific issues related to actions to be taken by the users when interfacing with budget data.

A good set of visualisations, which are self-explanatory and easy to understand, can certainly improve usage of an open budget initiative. Interactive visualisations and infographics can also enable a stakeholder to focus on a particular aspect of the data. In [27], impacts of usability and design issues are discussed. The experiments showed how improvements in design led to better results with users.

Several aspects of this dimension overlap with dimensions of the Publishing part. In particular, different ways of accessing data (Access dimension) heavily impact usability, and exporting data in different formats (Formats dimension), such as CSV, XML or RDB, is also important to encourage the reuse of data. Thus, the way data are published can enable stakeholders to get the most out of the open data.

DEFINITION. The Usability/Design dimension verifies if the initiative interface is suitable to the requirements of the use perspective.

Characterization attributes: The complexity of analysing user interfaces surpasses the scope of this paper. Nevertheless, we define a characterization attribute related to the software tool used by the initiative (CA15), understanding that the tool behind the initiative plays an important role in usability. Possible values are the two major open source software tools available for publishing open data: OpenSpending and CKAN.

In order to enable collaboration between the public sector administration and the other stakeholders, open budget initiatives have to provide means to discuss and give feedback on the provided data. This feedback might be provided to the public administrators either as comments or as a set of recommendations. For example, NGOs could give feedback on what the budget should focus on in their area of practice. Ideally, this communication process should be transparent, that is, feedback and recommendations given to public administrators should be publicly available and any changes resulting from the feedback should be recorded.

The importance of stimulating user engagement in open data initiatives through feedback and collaboration has been stressed by the Five Stars of Open Data Engagement model [5]. This model justifies the necessity of data being demand driven, contextualized, and collaborative. The conversation around data is also pointed out as a strategy to engage users. According to this model, data should be regarded as a common resource, which reinforces the necessity of collaboration. The lack of collaboration has been listed by [29] as one of the main factors hindering the development of open data policies.

To enable collaboration, tools to allow feedback on budget allocations and specific expenditure transactions should be provided to stakeholders. Public administrations must have the instruments to receive and effectively manage this feedback, enabling greater degrees of active citizen involvement and participation.

DEFINITION. The Feedback dimension represents the user's capacity to collaborate in data publishing and express her/his opinion.

Characterization attributes: Although this point requires a deeper analysis, we noticed that many open budget initiatives do not present any feedback support. Thus, we define one basic binary characterization attribute, which is the existence of a feedback mechanism (CA16). We check if it is possible to: (i) comment on data; (ii) submit a new data request; and (iii) report issues noticed in data analysis.
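The three checks behind CA16 can be recorded with a small data structure, sketched below. The class and field names are hypothetical and chosen only for this example; the binary attribute is derived as "at least one mechanism exists", which is our reading of the definition above.

```python
from dataclasses import dataclass


@dataclass
class FeedbackSupport:
    """The three feedback mechanisms checked for CA16 (hypothetical field names)."""
    comments: bool = False
    data_request: bool = False
    issue_reporting: bool = False

    def available(self):
        """Names of the mechanisms that the initiative offers."""
        return [name for name, offered in vars(self).items() if offered]

    def ca16(self):
        """Binary attribute: does any feedback mechanism exist?"""
        return bool(self.available())


# Invented example: an initiative that only allows comments on data.
initiative = FeedbackSupport(comments=True)
print(initiative.ca16(), initiative.available())
```

Keeping the three checks as separate fields, rather than storing only the binary result, preserves the detail needed to report which mechanisms are present in each initiative.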
7. ANALYSIS OF OPEN BUDGET DATA INITIATIVES
In this section, we describe the application of the model to a number of open budget initiatives. The goal of this evaluation is not to be extensive or to achieve statistical significance, but rather to test the model, to discover its potentials and limitations, and to gain some intuition on the domain. Results are shown in Table 2, and data can be accessed at http://bit.ly/1FNThhH.

The 23 initiatives were chosen considering a balance between primary (11) and secondary (12) sources (CA4). The sample also contains at least five initiatives strongly related to each use perspective, and considers initiatives from 6 countries plus the European Union, presented in five different languages. Some of the analysed initiatives are listed on the Map of Spending Projects (available at http://community.openspending.org/map-of-spending-projects/).

All primary sources are maintained by the government, and most of the secondary ones are society driven. Among them, two initiatives were identified as maintained in partnership between government and society organizations (CA6). Initiatives generally display their objectives (22 - CA1), but only 11 explicitly mention their intended audience (CA2). Also, most initiatives offer data for download (18), which favours UP1, and more than half of them (13) make visualization available, favouring UP2.

Even considering the low number of initiatives evaluated, two outcomes drew attention, regarding feedback and semantics. Commenting on data is allowed in only three initiatives, and the same number (but not the same ones) offer a data request form. No issue reporting mechanisms were found, revealing a strong absence of feedback possibilities (CA16).

The lack of semantics support (only three offered it - CA12) and of linkable data (again, only three had it - CA10) may also indicate that UP3 is still far from reality. Ten initiatives use categories for the datasets, which at least facilitates some form of comparison.

Regarding the use perspectives, we can state:

UP1 - Transparency: The main requirements for this use perspective (data on transaction level, machine readable formats, aggregation levels) were accomplished by most of the open budget initiatives. However, much work is still to be done concerning feedback handling. We can say that, for most of the analysed cases, stakeholders interested in auditing government and in translating data into more accessible formats are partially satisfied.

UP2 - Participation: The requirements set for this use perspective emphasized human readable formats that allow citizens without deep budget knowledge to understand data and to participate in discussions. Slightly more than half of the initiatives present graphics, which can help to gain quick insights into the data. Only three initiatives offer maps to visualize budget data, which is consistent with the low number of initiatives that include the location dimension (eight). Another aspect emphasized in this use perspective was usability and design. Considering the already mentioned limitations on assessing this issue, we noticed that ten initiatives use standard open source software tools. Although this is not the most relevant factor regarding usability, the use of standard tools favours users dealing with several open budget initiatives. Moreover, since these are open source tools, the more initiatives use them, the better they can be developed.

UP3 - Policy Making: The main requirements in this perspective were the use of common classifications, vocabularies and ontologies, and the possibility of linking data with other databases. As already mentioned, semantics support was mostly absent. Comparison tools, also important in this case, were found in only three of the initiatives. Thus, this use perspective is still far from being realised in most of the analysed initiatives. All these findings indicate that working on standard terminologies and common conceptualizations, as suggested by OpenSpending [15], is highly desirable.
8. CONCLUSIONS
In this paper, we presented a model to analyse open budget initiatives, including dimensions and assessable characterization attributes. The model covers, at this stage, the General Aspects, Publishing and Consumption dimensions of these initiatives. Initial testing of the model and analysis of 23 open budget initiatives revealed that attention has to be given to feedback, semantics support and linking possibilities.

Future research includes adding the Context dimensions to the model, and developing characterization attributes for them. We also intend to extend the assessment to other initiatives, as well as to conduct a more detailed analysis of some of their characteristics. As the results pointed to a very weak performance on the Feedback dimension, we aim to further explore the Consumption part, in order to propose solutions that can contribute to this issue. The Usability dimension also needs more consistent characterization attributes. The use of standard vocabularies or ontologies and linkable data formats also deserves attention.

Regarding the use perspectives, we conclude that transparency requisites were mostly accomplished by the analysed initiatives. Participation, in turn, is still not heavily supported, while tools for comparing budget data by policy makers are still far from reality.

It must be noted that materializing transparency is far more complex than just publishing budget data through software tools. Several political issues are related to data publishing, and deep data literacy questions are involved in the usage of open budget initiatives. Gurstein [8] warns of the emergence of a "data divide", a parallel concept to the digital divide, distinguishing people "who have access to data which could have significance in their daily lives and those who don't". Thus, transparency policies cannot be implemented without actions to foster digital inclusion, and why not, "data inclusion".

Software tools are, although indispensable, just part of the process. With this research effort, we aim to enrich the existing knowledge on open budget, and help to set the basis for developing tools and procedures that, together with other actions, may result in real benefits of fiscal transparency to society.
9. ACKNOWLEDGMENTS
A. Tygel is supported by CAPES/PDSE grant 99999.008268/2014-02. M. L. M. Campos is partially supported by CNPq–Brazil. Also, this work is supported in part by the European Commission under the Seventh Framework Program FP7/2007-2013 (LinDA – GA ICT-610565).
10. REFERENCES
[1] U. Atz, T. Heath, and J. Fawcet. Benchmarking Open Data Automatically. Technical report, Open Data Institute, London, 2015.
[2] N. Beghin and C. Zigoni. Measuring Open Data's Impact of Brazilian National and Sub-National Budget Transparency Websites and its Impacts on People's rights. Technical report, INESC, Brasília, 2014.
[3] R. Caplan, T. Davies, A. Wadud, S. Verhulst, J. M. Alonso, and H. Farhan. Towards common methods for assessing open data: workshop report & draft framework. Technical report, Workshop on common methods for assessing open data, New York, 2014.
[4] L. Chambers, V. Dimitrova, and R. Pollock. Technology for Transparent and Accountable Public Finance. Technical report, Open Knowledge Foundation, Cambridge, 2012.
[5] T. Davies. Supporting open data use through active engagement. In Proceedings of the W3C Using Open Data Workshop, pages 1–5, Brussels, 2012.
[6] European Commission. EU Anti-Corruption Report. Technical report, European Commission, Brussels, 2014.
[7] K. Granickas. Understanding the impact of releasing and re-using open. Technical report, European Public Sector Information Platform, 2013.
[8] M. B. Gurstein. Open data: Empowering the empowered or effective data use for everyone? First Monday, 16(2):1–7, 2011.
[9] T. Heath and C. Bizer. Linked Data: Evolving the Web into a Global Data Space. Morgan & Claypool, Synthesis Lectures on the Semantic Web: Theory and Technology, 1st edition, 2011.
[10] N. Huijboom and T. V. D. Broek. Open data: an international comparison of strategies. European Journal of ePractice, 1(12):1–13, 2011.
[11] International Budget Partnership. Open Budget Survey 2012. Technical report, International Budget Partnership, 2012.
[12] International Monetary Fund. The Special Data Dissemination Standard. IMF Publication Services, Washington, 2013.
[13] J. Manyika, M. Chui, P. Groves, D. Farrell, S. Van Kuiken, and E. A. Doshi. Open Data: Unlocking Innovation and Performance with Liquid Information. Technical Report October, McKinsey, 2013.
[14] M. Martin and J. Lehmann. LinkedSpending: OpenSpending becomes Linked Open Data. Semantic Web Journal, 1(1):1–5, 2013.
[15] OpenSpending. Budget Data Package. Technical report, Open Knowledge Foundation, 2014.
[16] OpenSpending. Spending Data Handbook. Technical report, Open Knowledge Foundation, 2014.
[17] T. Peixoto. Beyond Theory: e-Participatory Budgeting and its Promises for eParticipation. European Journal of ePractice, 1(March):1–9, 2009.
[18] D. Reynolds. Guide to the Payments Ontology. Technical report, Epimorphics Ltd, 2010.
[19] Royal Society. G8 Science Ministers Statement. Technical report, Foreign & Commonwealth Office, London, 2013.
[20] M. T. Santos, W. Cruz, and M. S. Fonseca. Uma Ontologia das Classificações da Despesa do Orçamento Federal. In Proceedings of Ontobras, pages 266–271, Recife, 2012.
[21] Y. Sintomer and C. Herzberg. Participatory Budgeting in Europe: Potentials and Challenges. International Journal of Urban and Regional Research, 32(1):164–178, 2008.
[22] B. Ubaldi. Towards Empirical Analysis of Open Government Data Initiatives. Technical Report 22, OECD Working Papers on Public Governance, 2013.
[23] M. Vafopoulos, M. Meimaris, I. Anagnostopoulos, A. Papantoniou, and I. Xidi. Public spending as LOD: the case of Greece. Semantic Web Journal, 6(2):155–164, 2015.
[24] M. Vafopoulos, M. Meimaris, J. M. A. Rodríguez, I. Xidias, M. Klonaras, and G. Vafeiadis. Insights in global public spending. In Proceedings of the 9th International Conference on Semantic Systems - I-SEMANTICS '13, pages 135–139, 2013.
[25] N. Veljković, S. Bogdanović-Dinić, and L. Stoimenov. Benchmarking open government: An open data perspective. Government Information Quarterly, 31(1):278–290, 2014.
[26] V. Vlasov and O. Parkhimovich. Development of the Open Budget Format. In Proceedings of the 16th Conference of FRUCT Association, pages 129–136, Oulu, 2014.
[27] S. T. Walker. Budget mapping: Increasing citizen understanding of government via interactive design. In Proceedings of the Annual Hawaii International Conference on System Sciences, pages 1–9, 2010.
[28] B. Worthy. David Cameron's Transparency Revolution? The Impact of Open Data in the UK. Technical report, University of London - Birkbeck College, London, 2013.
[29] A. Zuiderwijk and M. Janssen. Open data policies, their implementation and impact: A framework for comparison. Government Information Quarterly, 31(1):17–29, 2014.
[30] A. Zuiderwijk and M. Janssen. The Negative Effects of Open Government Data - Investigating the Dark Side of Open Data. In