Hit by the Data: a visual data analysis regarding the effects of traffic public policies
Luana Müller, Camila Moser, Guilherme Paris, Lucas Freitas, Mayara Oliveira, Wagner Signoretti, Isabel Harb Manssour, Milene Selbach Silveira
School of Technology, Pontifical Catholic University of Rio Grande do Sul, Porto Alegre, Brazil
Abstract
The availability of Open Government Data (OGD) provides means for citizens to understand and follow governmental policies and decisions, showing evidence of how the latter have contributed to both the place they live in and their lives. In such a scenario, one of the proposals is the use of visualizations to support the process of data analysis and interpretation. Herein, we present the use of three different visualization tools (a commercial one and two academic ones) applied to two specific Brazilian cases: the implementation of the Drink Driving Law and the construction of a new overpass on an important city avenue. Our focus was on the analysis of how visualization could help in the identification of the effects of such traffic public policies. As our main contributions, we present details on the effects of the observed policies, as well as new cases showing how visualization tools can assist users to interpret OGD.
Keywords: Information Visualization · Open Government Data · Case Study · Traffic
1. Introduction
The increasing availability of Open Government Data (OGD), allied to the lack of standardization in the way they are presented, leads to great difficulty when it comes to their use by citizens in general. Every citizen who makes professional use of these kinds of data needs means to analyze and get insight about them. In this scenario, Graves and Hendler [1] advocate the use of “visualizations as a medium to consume, share and interact with data.” According to Sivarajah et al. [2], “the use of visualization techniques makes the data analysis task easier for human interpretation and provides support to decision making”.
We can find several investigations exploring the use of visualization to support OGD analysis [3, 4]. In the Brazilian context, for instance, the work from De Mendonça, Maciel and Filho [5] presents a case study in which a map is created to visualize information regarding the infestation of the Aedes aegypti mosquito in the city of Cuiabá. Still in the Brazilian context, Craveiro and Martano [6] reported that, despite the data availability on government portals, these data have not always been understood by the broader public. Thus, the authors presented a tool called Cuidando do Meu Bairro (Looking After My Neighborhood), used in São Paulo city to promote a better visualization of government spending, and, therefore, to foster citizen engagement.
Despite different attempts to create visualization tools that would help in better understanding OGD, the systematic mapping on OGD visualization presented by Eberhardt and Silveira [7] shows us that one of the major challenges in this field refers to the skills required to use such tools [3, 8, 9].
In this context, Méndez, Hinrichs and Nacenta [10] performed a study in which they compared people’s visualization processes using two visualization tools (Tableau Desktop and iVoLVER) and discussed how different approaches can influence the visualization process, the decisions on the visualization design, the sense of control and authorship, and the enthusiasm to explore alternative designs.
In the research herein presented, we also chose to investigate different visualization tools: a commercial one and two academic ones. In our case, the investigation was based on the analysis of OGD visualizations applied to two specific Brazilian cases: the implementation of the Drink Driving Law and the construction of a new overpass. We analyzed the data related to the year before each event had taken place (the law implementation and the overpass construction) and to 2016, the last year covered by the available OGD. Our focus was the analysis of how visualization could assist in the identification of the effects of such traffic public policies.
The remainder of this paper is organized as follows. Section 2 describes our research methodology, including the dataset used, the selected case studies, and the visualization tools used. The results of our analysis, with our findings, are presented in Section 3 and further discussed in Section 4. Finally, we end this paper with our conclusions and future work directions in Section 5.
2. Research Methodology
The research methodology followed in this work has been split into five steps: I) literature review; II) selection of a dataset; III) identification of case studies; IV) selection of three different tools; V) performance of visual analysis and identification of findings. The following subsections present an explanation of the data used, a description of the selected case studies, and, lastly, a presentation of the visualization tools used in our visual analysis. Our findings will be discussed in depth in the next section.
For our visual analysis, we used OGD related to traffic accidents that had occurred in Porto Alegre, a city in southern Brazil. An open government data portal, maintained by the City Hall, records OGD about traffic accidents that occurred between 2000 and 2016, in csv format. These records generate large multivariate datasets, with several attributes associated with a specific location (latitude and longitude), such as accident type, time, date, number of injuries, fatalities, vehicle category, and weather conditions.
However, as these data have a different format depending on the year, we chose the data referring to 2016 to be used as a model for the standardization of the other years. We also removed some attributes that were unnecessary for our analysis, either because we did not consider the data useful (e.g. consortium), because they duplicated other information, or because they were blank in most records. Besides that, we had to deal with missing data and non-standardized data formats, which was the case of the date. For these standardizations, we used OpenRefine.
We chose two case studies to analyze how visualization could help in the identification of the effects of traffic public policies. Our main goal is to identify how laws and street works have helped in reducing traffic accidents in the city.
The first case study is related to the Drink Driving Law, which was enacted on June 19, 2008. This law prohibits the consumption of alcohol by drivers, imposing more severe penalties for those who drive under the influence of alcohol. Its main purpose is to avoid accidents that may occur due to the carelessness of drivers with impaired operating abilities. Thus, we decided to evaluate the total number of accidents that happen on weekends (from Friday to Sunday), when people in Brazil usually go to parties, bars and restaurants, that is, places where there is typically more consumption of alcoholic beverages.
It was mainly on weekends that "Balada Segura" also began to be held in 2012. The term, freely translated into English as "Safe Clubbing", refers to a surveillance operation carried out by traffic agencies and the Military and Civil Police, which aims at preventing alcohol-related traffic offences in places and at times with a higher incidence of accidents. Considering these dates, we opted for analyzing the total number of accidents in 2007, before the Drink Driving Law, and in 2016, after people had already become aware of the zero tolerance laws and drunk driving inspections.
The construction of an overpass on an important avenue of Porto Alegre, Brazil, was the second chosen case study. The idea was to evaluate how much the construction of the overpass has influenced the reduction of traffic accidents, since it was a very busy crossroad. Since the overpass construction began in August 2012 and the overpass only opened in June 2015, we chose to analyze the total number of accidents in 2011 and 2016, before and after its construction.
Three tools were chosen for our proposed analysis: a commercial one and two academic ones (developed in our research group). The next subsections describe them in detail.
2.3.1. Tableau
Tableau is commercial software for data analysis and visualization, both for individual and group analysis. It comprises several tools required to generate visualizations, from receiving data files of various formats to the generation of different charts and dashboards. The provided interface allows a quick selection and optimization of data, and also suggests chart models for visualization, such as treemaps, bars, bubble, pie, line, and scatter, among others.
2.3.2. GeoCharts
This tool presents an interactive visualization design for visual analysis of geospatial multivariate data that facilitates data analysis and knowledge discovery. It allows the representation of several attributes on the map with dynamically linked charts and an interactive approach based on the brushing and linking technique integrated with coordinated multiple views that can support visual analysis. Thus, a set of interactive visualizations is combined to stimulate an active interaction of the end user. It offers a way to interactively apply different filters and analyze several attributes, updating all the related visual representations, including the map with clustered charts.
Footnotes: open government data portal: http://datapoa.com.br/dataset/acidentes-de-transito; OpenRefine: openrefine.org; Drink Driving Law: Law number 11.705 - https://bit.ly/2v2sXBy; Balada Segura: https://baladasegura.rs.gov.br/inicial
2.3.3. Traffic Accidents Analyzer (TrAcc)
This tool provides different visualization techniques specifically for the OGD of traffic accidents. It provides an animated heat map based on a timeline, as well as three charts that help visualize the information contained in the data: a Historical Bar Chart, which shows the number of accidents within a time range (0h to 23h); a Pie Chart, showing the regions of the city (north, south, east, and center) where most accidents occur; and a Horizontal Bar Chart, showing the types of accidents and their total number.
Several filter options are also available, such as the type of vehicle (car, motorcycle, truck, etc.), time range (morning, afternoon, night, dawn), weather condition (good, rainy, cloudy), and days of the week.
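The weekend and shift filters used throughout the analysis can be illustrated with a short sketch. This is not code from any of the three tools; the sample records, the `shift_of` helper, and the 06:00 to 17:59 boundary for the "day" shift are assumptions made only for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical records: (ISO date, hour of day). Real rows would come from the CSV files.
SAMPLE = [
    ("2016-01-01", 23),  # Friday, night
    ("2016-01-02", 10),  # Saturday, day
    ("2016-01-04", 9),   # Monday, excluded by the weekend filter
    ("2007-06-17", 2),   # Sunday, night
]

# datetime.weekday() numbers Monday as 0, so Friday=4, Saturday=5, Sunday=6.
WEEKEND = {4, 5, 6}


def shift_of(hour: int) -> str:
    """Assumed split: 06:00-17:59 counts as 'day', everything else as 'night'."""
    return "day" if 6 <= hour < 18 else "night"


def weekend_counts(records):
    """Count accidents per (year, shift), keeping only Friday to Sunday."""
    counts = Counter()
    for iso_date, hour in records:
        day = datetime.strptime(iso_date, "%Y-%m-%d")
        if day.weekday() in WEEKEND:
            counts[(day.year, shift_of(hour))] += 1
    return counts


counts = weekend_counts(SAMPLE)
```

Grouping by `(year, shift)` mirrors the three filters (year, day of the week, shift) applied in the first case study; a different shift boundary would change the day and night totals but not the overall count.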
3. Findings
As previously mentioned, we chose two scenarios to support our case studies. A step common to both case studies was loading the data into the tools, which enabled us, once the import was complete, to create the visualizations.
Regarding the steps described in Subsection 2.1, it was fairly simple to load the data into GeoCharts, since it allows the use of any data in the csv format. Tableau also supports loading csv data. However, some problems happened when the files were opened; for instance, the columns were not created correctly. Nonetheless, these problems were easily solved by converting the files to xls format. One situation identified in both tools, GeoCharts and Tableau, was related to the latitude and longitude, which, in some cases, were not readable by them. The solution to this issue was to change the decimal separator format. TrAcc did not present any problems, since the data were already embedded in it.
The subsections below describe the findings grouped by case study, also presenting the visualizations which helped us throughout the analysis.
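The decimal separator fix, together with the date standardization mentioned for the dataset, can be sketched as follows. The cleaning in this work was done with OpenRefine, so this is only an equivalent illustration; the function names and the list of candidate date formats are assumptions, not a description of the actual files.

```python
from datetime import datetime


def parse_coordinate(text: str) -> float:
    """Parse a latitude/longitude that may use ',' as the decimal separator."""
    return float(text.strip().replace(",", "."))


# Assumed date variants; the real files may use other layouts.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")


def normalize_date(text: str) -> str:
    """Return the date in ISO format, trying each assumed layout in turn."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")
```

For example, `parse_coordinate("-30,0346")` and `parse_coordinate("-30.0346")` yield the same value, which is the property the tools needed in order to place the records on the map.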
In our first case study on the effects of the Drink Driving Law, we started by analyzing the GeoCharts tool. To reach our goal, it was necessary to apply three filters while using GeoCharts: year, day of the week (Friday, Saturday, and Sunday), and shift. After that, GeoCharts showed a map and different associated charts presenting the numbers of traffic accidents in 2007 and 2016 (Figure 1). In order to filter data by a specific year, users must click and select the year they want in the year chart. We highlight that, regarding these charts, users could pre-select the variables they wanted to visualize in charts along the map (five charts at most). At any time, in case users wanted to change the charts, they only needed to select new variables and generate a new visualization.
In general terms, the results showed that in 2007, before the implementation of the law, 5076 accidents had occurred during the day and 3180 at night, whereas in 2016 the number of accidents had fallen to 3083 occurrences during the day and 1345 during the night. In GeoCharts, the number of occurrences is numerically presented next to the map at the top, in the same figure area as the charts.
Figure 1.
General overview regarding traffic accidents in 2007, by GeoCharts
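The size of the change reported above can be made explicit with a small worked calculation. The counts are the ones read from GeoCharts in the text; the helper name `relative_reduction` is ours, and the figures describe a difference between two years, not a causal estimate.

```python
# Weekend accident counts read from the visualizations (see text above).
COUNTS = {
    2007: {"day": 5076, "night": 3180},
    2016: {"day": 3083, "night": 1345},
}


def relative_reduction(before: int, after: int) -> float:
    """Percentage drop from `before` to `after`."""
    return (before - after) / before * 100


day_drop = relative_reduction(COUNTS[2007]["day"], COUNTS[2016]["day"])
night_drop = relative_reduction(COUNTS[2007]["night"], COUNTS[2016]["night"])
total_drop = relative_reduction(
    sum(COUNTS[2007].values()), sum(COUNTS[2016].values())
)

print(f"day: {day_drop:.1f}%  night: {night_drop:.1f}%  total: {total_drop:.1f}%")
# prints "day: 39.3%  night: 57.7%  total: 46.4%"
```

The night shift shows the larger relative drop, which is the period the weekend filter was chosen to expose.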
To achieve the same results with Tableau and TrAcc, it was necessary to apply the same three filters. Regarding these filters, TrAcc offers six filtering options: by vehicle type, day shift, weather condition, day of the week, address, and year. Tableau, on the other hand, offers the possibility of choosing any variable from the dataset to be used as a filter.
Regarding how visualizations are presented, Tableau, like GeoCharts, allows us to jointly visualize the data from different time periods. However, it only presents one chart at a time, and users must switch among different types of charts (for instance, in Figure 2 we present the bar chart, chosen to present a simple overview regarding the general numbers from both years).
Regarding the visualization provided by TrAcc, it shows a heat map in which it is possible to visualize the data from the selected years (Figure 3). Like GeoCharts, TrAcc also provides different charts that help give meaning to the data. Differently from the other tools, in which it is simple to identify the totals, the information in TrAcc is presented in a partitioned way through the charts. To verify the totals, it is necessary to manually calculate the information by summing up the data presented in the chart Type of Accident, in which the numbers are clearly presented.
Footnote: The systems used were developed in Portuguese. We provide English subtitles in the figures.
Figure 2.
General overview regarding traffic accidents in 2007 and 2016, by Tableau
Figure 3.
General overview regarding traffic accidents in 2007, by TrAcc
In regard to the second case study, we analyzed the impact of the construction of an overpass on an important city avenue in southern Brazil. By using GeoCharts, it was required to filter by year, and to zoom in and find the desired location on the map. After finding the location, users need to analyze and check the number presented on the map. The number we found is related to accidents which had occurred around the crossroad area, not exactly at the intersection of the two avenues. We identified 53 accidents in 2011 at the crossroad where the overpass is located nowadays (Figure 4 left). Following the same steps, we verified that the number of accidents in the same region was reduced to 14 in 2016.
By using Tableau, we applied a filter regarding the years, and also filters regarding the required address (in this case, we specified two addresses, representing the crossroad). Thus, differently from GeoCharts, the tool provided us with a chart visualization of the accidents that had taken place exactly at the intersection of the avenues. With that, we found that, in 2011, 14 accidents had occurred at this crossroad, and, in 2016, after the construction of the overpass, the number fell to only one accident (Figure 5).
Similar to Tableau, TrAcc provides a filter in which it is possible to specify the year (as presented in Figure 3), and filters users can apply to determine the addresses they want to analyze. The results presented by TrAcc in the heat map are not influenced by the addresses. However, the associated charts change, presenting the numbers related only to the specified addresses.
Figure 4.
Visualization of accidents on the crossroad in 2011 and 2016, by GeoCharts
Figure 5.
Visualization of accidents in 2011 and 2016 on the crossroad, by Tableau
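The gap between the area-based counts (53 and 14) and the exact-address counts (14 and 1) comes from two different selection rules. The sketch below contrasts them on made-up records; the coordinates, street names, and the 150 m radius are all hypothetical, and neither function reproduces how GeoCharts, Tableau, or TrAcc is implemented.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


# Hypothetical crossroad and accident records (not real data).
CROSSROAD = (-30.0000, -51.2000)
CROSSROAD_STREETS = {"AV A", "AV B"}
ACCIDENTS = [
    {"streets": {"AV A", "AV B"}, "lat": -30.0000, "lon": -51.2000},  # at the intersection
    {"streets": {"AV A"}, "lat": -30.0005, "lon": -51.2000},          # roughly 56 m away
    {"streets": {"AV B"}, "lat": -30.0000, "lon": -51.2010},          # roughly 96 m away
    {"streets": {"R C"}, "lat": -30.0200, "lon": -51.2000},           # roughly 2.2 km away
]


def exact_matches(accidents, streets):
    """Address filter: keep records registered at exactly these two streets."""
    return [a for a in accidents if a["streets"] == streets]


def within_radius(accidents, center, radius_m):
    """Area view: keep records whose location lies within radius_m of center."""
    return [
        a for a in accidents
        if haversine_m(center[0], center[1], a["lat"], a["lon"]) <= radius_m
    ]


at_address = exact_matches(ACCIDENTS, CROSSROAD_STREETS)
in_area = within_radius(ACCIDENTS, CROSSROAD, 150)
```

On this sample the address filter returns one record and the 150 m area returns three, the same kind of divergence observed between the address-based and map-based views of the crossroad.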
4. Discussion
The tools we investigated present different visualizations, helping us analyze the available data from different perspectives.
In the next subsections, we first discuss aspects related to specific possibilities of use of the analyzed tools, followed by how these possibilities helped us during our case study analysis.
Regarding data import, TrAcc displays the data related to accidents from Porto Alegre city already loaded in the system, which spares the user from having to deal with details regarding pre-processing and data import. However, Tableau and GeoCharts accept any type of dataset and can be used for different purposes. Also, after the data import, it is very simple for the user to change the charts that are being presented (GeoCharts allows the presentation of five charts at the same time, as well as allowing the user to choose five visualization variables in these charts), while in TrAcc the three charts and six filtering variables provided by the tool (besides the heat map) cannot be changed.
In regard to filtering precision, in the overpass case study, it calls our attention that GeoCharts, differently from Tableau and TrAcc, did not allow us to specify the addresses we wanted to visualize and, consequently, did not allow us to precisely check for accidents in a particular location. On the other hand, by using the zoom, GeoCharts provides a macro view, as presented in Figure 4 (right), that allows us to have a broader perspective of the analysis. By using the other tools, the users would need to know the exact description of surrounding addresses in order to visualize them. Although a single accident occurred at the exact crossroad address (as presented by the other tools), there were 14 accidents in the area.
Also, GeoCharts allows us to click on each one of the bullets presented in the image, providing detailed information about the accident (such as the kind of vehicle, weather conditions, day of the week, etc.). TrAcc, in turn, also allows zooming in and out of the map to approach a location (Figure 6). However, it does not provide more information regarding the accidents. Despite the lack of information, the user has a fast overview of the accidents that had happened in that area through the heat visualization, whereas only numbers are presented in GeoCharts.
Figure 6.
Visualization of accidents in 2016 around the crossroad area, by TrAcc
Regarding the possibility of comparing different periods in the same chart, Tableau allows it to be done, while this is not possible by using the other tools. In both GeoCharts and TrAcc the user needs to check the periods one at a time. The user can select different years to create the visualization, but the results are presented based on a concatenation of the data from the selected years. As presented in Figure 1, there is a chart (located in the figure’s bottom right corner) with information regarding the years 2007 and 2016, and the user must click on the period they want to visualize in more detail. The presentation of the results in Tableau makes the visualization easier, especially when comparing different periods, as presented in Figure 5, but it does not offer more details about those years. If users want to explore such information, they need to generate a new visualization, choosing the variables they want to visualize.
In regard to how these different visualizations assist the interpretation of data, we would like to discuss and focus on each case study’s needs.
The analysis of the Drink Driving Law effects requires an overview regarding the situation in the whole city. To achieve a simple and straightforward analysis, the visualization from Tableau may be enough, since it shows the numbers and charts related to it. Since we are dealing with information from locations all over the city, a visualization with many details could have a negative effect on the users, since it would take a lot of effort to interpret this view. However, when a detailed perspective is needed, both GeoCharts and TrAcc allow the users to identify the areas of the city on which the new law had more impact, and this insight is even simpler to obtain through the heat map offered by TrAcc. Furthermore, the charts from GeoCharts allow fast filtering and the visualization of numbers, providing a quick response and helping users while they are interacting with and interpreting these data.
About the second case study, we wanted to visualize a situation encompassing a very specific part of the city (a crossroad) in order to understand if the construction of an overpass had effectively helped traffic. Unlike the first case study, in this one details can improve interpretation and are easy to visualize (since our analysis involved a very small area of the city). Regarding the precise view required in this case, both Tableau and TrAcc allow the users to specify the address they want to visualize (in this case, two addresses corresponding to the intersection of two avenues). However, this resource can lead users to some false interpretation, since the accidents that had happened around the crossroad are not presented to them.
GeoCharts, on the other hand, does not allow users to specify the address to be visualized; therefore, users need to zoom in to find the location they want (a task that can become costly in cases where the city is large and the user is not familiar with the city map). Thus, the possibility to specify the address must be available to users; however, it cannot limit the visualization without presenting the areas around it. Also, to provide a richer interpretation, since the scenario is limited to a small part of the city, details regarding the accidents must support the user, like those presented by GeoCharts and TrAcc.
5. Conclusion and Future Work
Nowadays, it is easy to find Open Government Data (OGD) available on governmental portals, but it is not so easy to employ them without some help. Similar to other researchers in the field, we advocate that the use of visualization techniques makes it easier for citizens to analyze and interpret these kinds of data.
In the research herein presented, we used three different visualization tools to analyze the OGD referring to two Brazilian cases: the implementation of the Drink Driving Law in 2008 and the construction of a new overpass at an important crossroad, which started in 2012 and was concluded in 2015. We focused on analyzing how visualizations could support the identification of the effects of the studied traffic public policies, and we presented a twofold contribution: first, the details on the effects of the observed policies, and, second, two more cases showing how visualization tools could help users to interpret OGD.
As for our next research steps, we intend to analyze the same cases with professionals who work closely with traffic and traffic policies to gather their opinion about the tools’ usage and their possibilities. We also intend, as future work, to analyze new case studies, enabling us to explore other perspectives of those given tools.
References
[1] Alvaro Graves and James Hendler. Visualization tools for open government data. In Proceedings of the 14th Annual International Conference on Digital Government Research, dg.o ’13, pages 136–145, New York, NY, USA, 2013. ACM.
[2] Uthayasankar Sivarajah, Vishanth Weerakkody, Paul Waller, Habin Lee, Zahir Irani, Youngseok Choi, R. Morgan, and Yuri Glikman. The role of e-participation and open data in evidence-based policy decision making in local government. J. Org. Computing and E. Commerce, 26:64–79, 2016.
[3] Alvaro Graves and Javier Bustos-Jiménez. Co-creating visual overviews for open government data. In Proceedings of the 16th Annual International Conference on Digital Government Research, dg.o ’15, pages 37–42, New York, NY, USA, 2015. ACM.
[4] Victor Santos, Pedro Camara, Flavia Bernardini, Jose Viterbo, and Douglas Jorge. A framework for constructing open data map visualizations. In Proceedings of the XIV Brazilian Symposium on Information Systems, SBSI’18, pages 12:1–12:7, New York, NY, USA, 2018. ACM.
[5] Patricia Graziely Antunes de Mendonça, Cristiano Maciel, and José Viterbo Filho. Visualizing aedes aegypti infestation in urban areas: A case study on open government data mashups. In Proceedings of the 15th Annual International Conference on Digital Government Research, dg.o ’14, pages 186–191, New York, NY, USA, 2014. ACM.
[6] Gisele S. Craveiro and Andrés M. R. Martano. Caring for my neighborhood: A platform for public oversight. In Fernando Koch, Felipe Meneguzzi, and Kiran Lakkaraju, editors, Agent Technology for Intelligent Mobile Services and Smart Societies, pages 117–126, Berlin, Heidelberg, 2015. Springer Berlin Heidelberg.
[7] André Eberhardt and Milene Selbach Silveira. Show me the data!: A systematic mapping on open government data visualization. In Proceedings of the 19th Annual International Conference on Digital Government Research: Governance in the Data Age, dg.o ’18, pages 33:1–33:10, New York, NY, USA, 2018. ACM.
[8] J. Brugger, M. Fraefel, R. Riedl, H. Fehr, D. Schöeneck, and C. S. Weissbrod. Current barriers to open government data use and visualization by political intermediaries. In , pages 219–229, May 2016.
[9] D. Pirozzi and V. Scarano. Support citizens in visualising open data. In , pages 271–276, July 2016.
[10] Gonzalo Gabriel Méndez, Uta Hinrichs, and Miguel A. Nacenta. Bottom-up vs. top-down: Trade-offs in efficiency, understanding, freedom and creativity with infovis tools. In