Putting Down Roots: A Graphical Exploration of Community Attachment
Andee Kaplan Eric Hare
Abstract
In this paper, we explore the relationships that individuals have with their communities. This work was prepared as part of the ASA Data Expo ’13 sponsored by the Graphics Section and the Computing Section, using data provided by the Knight Foundation Soul of the Community survey. The Knight Foundation, in cooperation with Gallup, surveyed 43,000 people over three years in 26 communities across the United States with the intention of understanding the association between community attributes and the degree of attachment people feel towards their community. These attributes include the different facets of both urban and rural communities, the impact of quality education, and the trend in the perceived economic conditions of a community over time. The goal of our work is to facilitate understanding of why people feel attachment to their communities through the use of an interactive, web-based visualization. We explain the development and use of web-based interactive graphics, including an overview of the R package Shiny and the JavaScript library D3, focusing on the choices made in producing the visualizations and the technical aspects of how they were created. Then we describe the stories about community attachment that unfolded from our analysis.

Introduction
This work was part of the American Statistical Association’s Data Exposition 2013. The dataset came from the Knight Foundation’s ‘Soul of the Community’ project (The Knight Foundation, 2014). For this project, the Knight Foundation in conjunction with Gallup collected surveys from 43,000 people in the years 2008, 2009, and 2010, in 26 U.S. communities. A map of the 26 communities and the geographical regions to which we assigned them is displayed in Figure 1. The regions are the Great Plains, the West, the Deep South, the Southeast, and the Rust Belt. These regions were not included as part of the dataset, but rather were our own constructs created by a graphical exploration of the locations of each community. We roughly based these regions on the US Census regions and divisions of the United States map (The United States Census Bureau, 2014). We did not strictly adhere to state boundaries, but rather looked at the proximity of individual cities to the surrounding communities.

[Figure 1 legend (regions): Deep South, Great Plains, Rust Belt, Southeast, West]
Figure 1: Overview of the 26 Knight Foundation communities in which surveys were conducted. Assigned geographical regions are indicated by both color and shape. The Knight Foundation in conjunction with Gallup collected surveys from 43,000 people in the years 2008, 2009, and 2010 with the goal of understanding the association between community attributes and the degree of attachment people feel towards their community.

The survey contained raw responses as well as derived metrics that we used to gain insight into what makes a community thrive. The metrics we used can be found in Table 1. Each metric was calculated as a simple average of the responses to anywhere from 2 to 6 questions. The metrics gave insight into how residents felt their community rated on various dimensions. For example, Education covered both public education and higher education in the community, while Social Offerings dealt with both nightlife and neighborliness. Community Attachment combined questions on how proud residents were to live in their community, whether they would recommend the community as a place to live, and how they predicted the community would be in five years. A score of 5 indicated that the most positive response was given on all questions from which this metric is derived. We used the 10 metrics in Table 1 to look for relationships with Community Attachment within types of communities, as well as to explore any notable regional differences.

The goal of our work was to facilitate understanding of why people feel attachment to their communities through the use of an interactive, web-based visualization. Specifically, we took the point of view of a community planner, either from one of the communities in the study or from a community in the same region or of a similar urbanicity. By putting the user in the driver’s seat of their own experience, we allow the user to apply the conclusions of their interaction to their own situation.
The purpose of interaction is to discover what the data has to tell the world.

We did not attempt to draw statistical conclusions about the data, so we did not use the survey weights provided in the data in our analysis. Because the communities are sparsely and unevenly distributed throughout the United States, we felt an exploratory approach would help us to sift through the data and discover its patterns.

Many of the discoveries in the data were readily apparent, while others required more investigation. In the words of John Tukey, “Exploratory data analysis is detective work - numerical detective work - or counting detective work - or graphical detective work.” (Tukey, 1977) Dynamic, interactive visualizations can empower people to explore the data for themselves as well as encourage engagement with the data in a way that static visualizations cannot. Additionally, linking multiple visualizations shows different aspects of a complex data set and helps highlight relationships. Allowing actions in one plot to affect elements in other plots makes comparisons easy for the user without requiring much memorization. This aids in pattern finding by reducing cognitive load. In addition to wanting it to be easy to explore the data, we wanted the tool to be easy to use. A web-based application is platform-independent and allows the user to employ the tool without any software to download. Additionally, building an application that works on all modern browsers and operating systems places no platform limitations on who can use the tool. Finally, automatic feature additions and bug fixes can be completed transparently to the user.

To fully engage the user with our work and facilitate the emergence of interesting or descriptive patterns, we created CommuniD3 (available at http://andeek.shinyapps.io/CommuniD3 ), an interactive web-based tool that relies heavily on the idea of linked plots.
A linked plot will adapt to changes made in other plots within the collection, creating a dynamic and interactive set of graphics. Different visualizations illustrate different aspects of the data, and linking helps regain the multidimensional aspect of the data (Buja et al., 1991). In the following section we discuss the structure and tools used to build CommuniD3. Section 3 highlights an application of CommuniD3 in finding interesting stories across the United States.
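The idea of linking can be illustrated with a small sketch in which several views subscribe to one shared selection, so that a selection made through any of them is propagated to all. This is only an illustration of the general pattern, written in Python for brevity; it is not the D3 code used in CommuniD3, and the class and view names are hypothetical.

```python
from typing import Callable, List, Optional


class Selection:
    """Shared selection state that every linked view subscribes to."""

    def __init__(self) -> None:
        self.current: Optional[str] = None
        self._listeners: List[Callable[[str], None]] = []

    def subscribe(self, listener: Callable[[str], None]) -> None:
        self._listeners.append(listener)

    def select(self, community: str) -> None:
        # Record the new selection, then notify every subscribed view.
        self.current = community
        for listener in self._listeners:
            listener(community)


class View:
    """A stand-in for one plot; it only remembers what it should highlight."""

    def __init__(self, name: str, selection: Selection) -> None:
        self.name = name
        self.highlighted: Optional[str] = None
        selection.subscribe(self.on_select)

    def on_select(self, community: str) -> None:
        self.highlighted = community


selection = Selection()
views = [View(name, selection) for name in ("map", "bar chart", "dot plot")]

# A click in any one view goes through the shared selection,
# so every other view updates as well.
selection.select("Detroit, MI")
```

Routing every interaction through a single shared state keeps the views independent of one another: a new plot only needs to subscribe to the selection to take part in the linking.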
To visually understand attachment, we created an interactive web-based application. The construction and design of CommuniD3 are detailed in the following sections. First the user interface is described and then we discuss the wide range of technology we used to construct the tool.

CommuniD3 comprises three parts: (1) side panel, (2) map panel, and (3) plot panel, as seen in Figure 2. As the user interacts with each piece, the other portions of the interface update to reflect the interaction. In this way we have built an interactive graphical framework, rather than an animation.

Metrics (each metric is followed by the survey questions from which it is derived)

Community Attachment
    I am proud to say I live in [Community].
    [Community] is the perfect place for people like me.
    Taking everything into account, how satisfied are you with [Community] as a place to live?
    How likely are you to recommend [Community] to a friend or associate as a place to live?
    And thinking about five years from now, how do you think [Community] will be as a place to live compared to today?
Social Offerings
    Having a vibrant nightlife with restaurants, clubs, bars, etc.
    Being a good place to meet people and make friends
    How much people in [Community] care about each other
Openness
    Young, talented college graduates looking to enter the job market
    Immigrants from other countries
    Families with young children
    Gay and lesbian people
    Senior citizens
Aesthetics
    The availability of outdoor parks, playgrounds, and trails
    The beauty or physical setting
Education
    The overall quality of public schools in your community
    The overall quality of the colleges and universities
Basic Services
    The highway and freeway system
    The availability of affordable housing
    The availability and accessibility of quality healthcare
Leadership
    The leadership of the elected officials in your city
    The leaders in my community represent my interests
Economy
    The availability of job opportunities
    How would you rate economic conditions in [Community] today?
    Right now, do you think that economic conditions in [Community] as a whole are getting better or getting worse?
    How likely are you to agree that your job provides you with the income needed to support your family?
    Now is a good time to find a job in my area
    How satisfied are you with your job, that is, the work you do?
Safety
    How would you rate how safe you feel walking alone at night within a mile of your home?
    How would you rate the level of crime in your community?
Social Capital
    How many formal or informal groups or clubs do you belong to, in your area, that meet at least monthly?
    How many of your close friends live in your community?
    How much of your family lives in this area?
    How often do you talk to or visit with your immediate neighbors?
Civic Involvement
    Performed local volunteer work for any organization or group
    Attended a local public meeting in which local issues were discussed
    Voted in the local election
    Worked with other residents to make change in the local community
Table 1: The metrics used from the Knight Foundation Soul of the Community survey (Foundation, 2010). All metrics are on a 1-3 scale except for Community Attachment, which is on a scale of 1-5. A higher score on any metric indicates the respondent replied positively to the associated questions.

Figure 2: The components that make up CommuniD3, (1) Side panel, (2) Map panel, and (3) Plot panel.

(1) The side panel houses two features. The first is the ability to investigate the data for individual years versus aggregated across all three years. In this way we are able to explore attitude changes across the three years surveyed as well as overarching trends in the regions and urbanicities. The second is a colorblind-friendly option that uses a red-blue color scheme on the map rather than red-green to accommodate more users.

(2) The map panel is the central piece of the application. The 26 communities surveyed are plotted geographically on a map of the United States. The size of each dot represents the number of surveys for the chosen time period, while the color shading of each dot corresponds to the average value for each community in the chosen time period for the metric selected. The panel on the right allows the user to change the metric displayed. Additionally, each community is clickable. On click, basic information about that community is displayed below the map panel and the plot panel is updated to reflect the community that is clicked. It is our goal for a community planner to be able to start with the map panel and find a surveyed community that corresponds to the community they are interested in, or one that is nearby, as a means to delve into the driving factors of community attachment.

(3) The plot panel is a set of three linked plots that detail three aspects of the dataset. The first plot is a bar graph showing the average value of the metric selected for the year range selected for the community selected as well as for its urbanicity and its region.
For example, once Detroit, MI is the community in focus, the urbanicity is “Very high urbanicity-very large population” and the region is the Rust Belt. The fourth bar in the chart represents the aggregation of all the communities and serves as a reference. While the bar chart shows surface information, its true purpose here is to control the information displayed in the other two plots. As the user clicks on the bars, the other two plots display information pertaining to the level selected (either region, urbanicity, or the whole dataset). The middle plot is an ordered dot plot displaying the average value for the metric selected for all communities, with the level selected in the bar chart highlighted. See Figure 3 for an example of the different levels of highlighting. The third plot is a plot of pairwise correlations between each metric and community attachment for the level of aggregation (year and community/region/urbanicity). The small grey dots serve as a reference level in the background, displaying the correlation across all surveys aggregated, to ease comparison for the user. The three plots are linked in such a way that selection through clicking in one plot will affect all three plots and potentially the map panel. In this manner, the user can truly drive their experience and take ownership of their analysis.

Figure 3: Examples of the different types of highlighting available in the ordered dot plot from the plot panel. The highlighting corresponds to (1) community selection, (2) urbanicity selection, (3) region selection, and (4) all community selection. In this example, Detroit, MI has been selected to display the values of community attachment for all three years, 2008-2010.

2.1 The Shoulders of Giants
We incorporated several pioneering technologies in the creation of our application that allowed us to find insights in the dataset. Shiny (RStudio Inc., 2013) is an R package created by RStudio that enables R users to create an interactive web application that utilizes R as the background engine. In CommuniD3, Shiny is used as the framework upon which the application sits. D3 (Bostock, Ogievetsky, and Heer, 2011) stands for “Data-Driven Documents” and is a JavaScript library developed and maintained by Mike Bostock with the purpose of visualizing and interacting with data in a web-based interface. We used D3 and JavaScript to create the visualizations as well as to control all the user interaction with the application. The graphics and user interface are all stored entirely on the client side, allowing for seamless transitions of the graphics. See Figure 4 for a diagram of the ways Shiny and D3 are used in CommuniD3.

Figure 4: Diagram of the uses of D3 and Shiny in CommuniD3, specifically focusing on client versus server utilization. D3 and JavaScript are used to create the visualizations, as well as control the user interaction. Shiny and the associated R packages are the framework on which the application is built.

We also leveraged other R packages to help with data manipulation. We used plyr (Wickham, 2011), reshape2 (Wickham, 2007), and rjson (Couture-Beil, 2013) to split and aggregate metric values according to the levels selectable by the user before passing the data to the client side in the JSON format. For subsequent analysis after using CommuniD3 we used the R packages ggplot2 (Wickham, 2009) and maps (Becker and Wilks, 2013) to dive deeper into the interesting findings from the application.

In this section, we illustrate a specific example of how CommuniD3 can be used to highlight interesting features in the data. We then proceed by showing other interesting findings in the data, aided by the use of CommuniD3.

We elected to divide our analysis using two primary factors. The first was the geographic region the community was located in, and the second was the urbanicity of the particular community.
Urbanicity is a census designation which was provided in the dataset, while regions were determined by us. The interactive tool was then used to help us discover a story in the data for each of the five regions.
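The server-side step of averaging a metric at each selectable level and handing the result to the client as JSON can be sketched as follows. The paper's implementation used the R packages plyr, reshape2, and rjson; the Python below is only an analogous illustration, and the record layout, field names, and values are hypothetical.

```python
import json
from collections import defaultdict
from statistics import mean

# Hypothetical respondent-level records; the values are made up for illustration.
responses = [
    {"community": "Detroit, MI", "region": "Rust Belt", "education": 2.0},
    {"community": "Detroit, MI", "region": "Rust Belt", "education": 1.0},
    {"community": "Akron, OH", "region": "Rust Belt", "education": 3.0},
    {"community": "Saint Paul, MN", "region": "Great Plains", "education": 2.5},
]


def aggregate(records, level, metric):
    """Average `metric` within each value of the grouping field `level`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[level]].append(record[metric])
    return {key: mean(values) for key, values in groups.items()}


# One averaged table per aggregation level, serialized for the client side.
payload = json.dumps(
    {
        "community": aggregate(responses, "community", "education"),
        "region": aggregate(responses, "region", "education"),
    },
    sort_keys=True,
)
```

An urbanicity level would be handled the same way, by adding an urbanicity field to each record and calling `aggregate` with that field name.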
The five communities comprising the Great Plains were Grand Forks, ND, Duluth, MN, Aberdeen, SD, Saint Paul, MN, and Wichita, KS. Through use of CommuniD3, we quickly discovered that the individuals in this region rated the quality of education in their community more highly than respondents elsewhere. Figures 5, 6, and 7 illustrate the sequence of steps in CommuniD3 that led to that conclusion.

You may wish to follow along in CommuniD3 as we describe the steps taken. First, we click on Saint Paul, as shown in panel (1). We examine the bar charts of Community Attachment displayed in panel (2), which indicate that the Great Plains region has an overall higher attachment than the average of all other cities.

Figure 5: (1) Select Saint Paul. (2) Focus on the comparison of Community Attachment in Saint Paul with other cities that have geographical or cultural similarities. Conclude that Saint Paul is more attached than most of its counterparts.

We can then look at the plot of correlations in panel (3), which shows that the Great Plains communities have an overall larger correlation between Education and Community Attachment than the average of all communities. This leads to switching the metric of interest in CommuniD3 to Education, and an examination of the ordered dot plot of panel (4). This plot makes clear that while Saint Paul is strong in education (ranking fourth of all communities), three other communities particularly stand out.

Figure 6: (3) Focus on the correlation between Community Attachment and the other metrics in Saint Paul to identify Education as particularly important. (4) Compare the mean value for education in Saint Paul with the other cities.

At this point, the story of the Education metric in the Great Plains becomes clear. Panel (5) illustrates the bar chart that was first shown in panel (2), but this time with regard to Education.
Saint Paul, despite ranking fourth overall, actually has a slightly lower value for this metric than do the Great Plains communities aggregated. This is explained by the fact that the Great Plains communities stand above the rest, averaging 2.2 out of 3 in Education compared to about 2.0 for the average of all cities. Our final step is to click on the Great Plains bar, which immediately highlights the Great Plains communities in the ordered dot plot as shown in panel (6).

Figure 7: (5) Focus on how the mean value for Education in the Great Plains is larger than the mean value for all communities aggregated. (6) Examine the comparison of all cities to observe that three of the top four communities in Education are located in the Great Plains.

It is quickly evident that Grand Forks, Aberdeen, and Saint Paul comprise three of the top four communities in the Education metric, helping to explain the fact that the Great Plains has the overall largest Community Attachment among the regions we considered.

A similar approach was taken for the other four regions, leading to the conclusions presented in this paper. Henceforth, we illustrate the findings in static graphics and tables for ease of presentation, all of which were motivated and discovered through use of CommuniD3.
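The correlation panel used in step (3) compares each metric with Community Attachment. A minimal sketch of that comparison, assuming an ordinary Pearson correlation (the text does not say which coefficient was used) and made-up respondent scores, might look like this:

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Assumes both sequences vary; a constant sequence would divide by zero.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up scores for five hypothetical respondents in one community.
attachment = [2.0, 3.0, 4.0, 5.0, 3.5]
metrics = {
    "Education": [1.0, 1.5, 2.0, 3.0, 2.0],
    "Safety": [2.0, 1.0, 3.0, 2.0, 2.5],
}

# Correlate every metric with attachment and pick the strongest one.
correlations = {name: pearson(values, attachment) for name, values in metrics.items()}
strongest = max(correlations, key=correlations.get)
```

Repeating the calculation for the respondents of a region, an urbanicity, or the whole dataset gives the reference values against which a single community can be compared.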
Another focus of our analysis was the influence of the Urbanicity designations of the communities. In the West region, which comprises Boulder, CO, San Jose, CA, and Long Beach, CA, we found some evidence of an urbanicity-specific metric correlated with attachment.

Table 2 displays the top five communities by the Openness metric. First, it can be seen that the three communities comprising the West region were all in the top five for Openness. Second, Boulder and Long Beach each have the urbanicity of “Very high urbanicity-medium population”. These communities have a relatively modest population, most of whom live in the urban core of the city. Figure 8 displays a two-dimensional bin plot of Community Attachment versus Openness, displaying only the communities with the designation “Very high urbanicity-medium population”. Areas of darker red have a higher frequency of responses than those that are white. Notice that both Boulder and Long Beach had more respondents indicating a high Community Attachment and a high Openness. By comparison, Akron, OH, and Gary, IN, two communities in the Rust Belt region with the same urbanicity, had much lower ratings on both of these scales. Bradenton, FL had many citizens highly attached to the community, but somewhat lower ratings for Openness compared to Boulder and Long Beach. Ultimately, communities in the West region with this urbanicity designation placed a higher value on the Openness of their community than did other communities of similar size in the rest of the country.

Community           Region         Urbanicity                                Openness
Long Beach, CA      West           Very high urbanicity-medium population    1.95
San Jose, CA        West           Very high urbanicity-large population     1.88
St. Paul, MN        Great Plains   Very high urbanicity-large population     1.88
State College, PA   Rust Belt      Medium/low urbanicity-low population      1.87
Boulder, CO         West           Very high urbanicity-medium population    1.84

Table 2: Top five communities by the Openness metric. Note that three of the communities are from the West region.

[Figure 8 panels: Boulder, CO; Akron, OH; Bradenton, FL; Long Beach, CA; Gary, IN. x-axis: Openness (1.0 to 3.0); y-axis: Community Attachment (1 to 5); fill: count]

Figure 8: 2D binned plot of responses for Openness and Community Attachment among the five communities with Very high urbanicity-medium population designations. Communities in the West region with this urbanicity designation placed a higher value on the Openness of their community than did other communities of similar size in the rest of the country.
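The counts behind a two-dimensional binned plot such as Figure 8 can be sketched as below. The axis ranges follow the scales given in Table 1 (Openness on 1-3, Community Attachment on 1-5); the number of bins, the function names, and the sample responses are assumptions made for illustration, and the figure itself was produced with ggplot2 rather than with this code.

```python
from collections import Counter


def bin_index(value, low, high, n_bins):
    """Map a value in [low, high] to a bin index in 0..n_bins-1."""
    width = (high - low) / n_bins
    # The upper edge of the range is folded into the last bin.
    return min(int((value - low) / width), n_bins - 1)


def bin2d(pairs, x_range=(1.0, 3.0), y_range=(1.0, 5.0), n_bins=4):
    """Count (openness, attachment) pairs falling into each 2D bin."""
    counts = Counter()
    for x, y in pairs:
        cell = (bin_index(x, *x_range, n_bins), bin_index(y, *y_range, n_bins))
        counts[cell] += 1
    return counts


# Made-up (Openness, Community Attachment) responses.
pairs = [(1.0, 1.0), (2.9, 4.8), (3.0, 5.0), (2.0, 3.0)]
counts = bin2d(pairs)
```

Each cell count would then be mapped to a fill color, with larger counts drawn darker, once per community panel.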
Exploring trends in the Deep South communities of Macon, GA, Milledgeville, GA, Columbus, GA, Tallahassee, FL, and Biloxi, MS quickly suggested that residents of these communities were displeased with the Safety of the community. In 2008, Macon and Columbus ranked third and fourth worst respectively among all 26 communities in terms of Safety. By 2010, the situation had degraded further, as Macon declined to the overall worst Safety rating, while Milledgeville ranked third worst, and Columbus remained the fourth worst. Biloxi was a notable exception, however, ranking eighth best in 2010. Biloxi also exceeded its fellow Deep South communities in terms of Social Offerings, ranking second best amongst all communities in each of the three years the survey was conducted, and by far the best amongst the communities in the Deep South.

As it turns out, Biloxi also had the highest overall Community Attachment rating in the Deep South. Figure 9 displays the average rating from 2008 to 2010 in terms of Safety, Social Offerings, and Community Attachment. Biloxi is highlighted in red, the other Deep South communities are highlighted in blue, and the rest of the communities are highlighted in gray. The stark difference between Biloxi and the rest of the Deep South is readily apparent, and helps to explain why Community Attachment was quite high in Biloxi.
[Figure 9 panels: Safety; Social Offerings; Community Attachment. x-axis: year (2008, 2009, 2010). Labeled communities: Columbus, GA; Macon, GA; Milledgeville, GA; Tallahassee, FL; Biloxi, MS. Legend (Community): Biloxi, MS; Deep South; Other]

Figure 9: Responses across the three years for Safety, Social Offerings, and Community Attachment. The stark difference between Biloxi and the rest of the Deep South in terms of Safety and Social Offerings is readily apparent. This helps to explain why Community Attachment was quite high in Biloxi.
In the Deep South, we saw some evidence suggesting Biloxi’s high rating for Social Offerings may have contributed to a strong sense of attachment in that community. Nowhere is this phenomenon more prominent than in the Southeast region, where Myrtle Beach, SC is located. Myrtle Beach was the fifth most attached community amongst all communities in the dataset. However, Myrtle Beach did no better than 13th in all other metrics with the exception of Social Offerings, where it was ranked first. In other words, residents of Myrtle Beach felt the Social Offerings in their community were very strong, while most other metrics, including Aesthetics, Openness, Safety, and Education, were rated poorly. Figure 10 illustrates this phenomenon with a parallel coordinate plot. The mean value for all metrics is displayed for each of the communities, with Myrtle Beach highlighted in green, and other Southeast communities highlighted in blue. The metrics are sorted from high to low by the average value for each metric amongst all the communities. Notice that Myrtle Beach fairly closely tracked the rest of the communities in all metrics except for Social Offerings, where a sizable “jump” can be observed.

[Figure 10 axes, in order: Community Attachment, Aesthetics, Social Capital, Civic Involvement, Education, Social Offerings, Openness, Safety, Basic Services, Economy, Leadership. y-axis: Metric Value - Aggregated Years. Labeled communities: Gary, IN; Grand Forks, ND; Aberdeen, SD; Detroit, MI. Legend (Community): Myrtle Beach, SC; Southeast; Other]

Figure 10: Mean value of metric for each community. Myrtle Beach is highlighted in green, and the other communities in the Southeast are highlighted in blue. Myrtle Beach closely tracked the rest of the communities in all metrics except for Social Offerings, where it was the highest rated.

The question remains of how Myrtle Beach had such highly attached citizens when only Social Offerings appeared to be a positive metric for the community. The reason is that Social Offerings was the single most highly correlated metric with Community Attachment in 23 out of the 26 communities, including Myrtle Beach.
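Statements such as "ranked first" or "no better than 13th" come from ordering the community means within each metric. A small sketch of that ranking follows; the community labels and mean values are placeholders rather than survey results, and ties are simply left in input order.

```python
def rank_of(community, scores):
    """1-based rank of `community` when scores are sorted from high to low."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered.index(community) + 1


# Placeholder community means for two metrics (not survey values).
means = {
    "Social Offerings": {"A": 2.1, "B": 1.8, "C": 1.7},
    "Safety": {"A": 1.6, "B": 2.0, "C": 1.9},
}

# Rank community "A" within every metric.
ranks = {metric: rank_of("A", scores) for metric, scores in means.items()}
```

Here community "A" is first in Social Offerings and last in Safety, the same kind of profile the parallel coordinate plot shows for Myrtle Beach.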
As the data covered the period from 2008 to 2010, we hoped to find some stories relating to the economic collapse, or the Great Recession, which began in 2008. We focused on the Economy metric in the Rust Belt communities, and noted a largely negative view of economic conditions in this region, particularly in its economic center, Detroit, MI. Table 3 displays the average response on a 1-3 scale for Economy in the Rust Belt communities across each of the three years. It can be seen that although all Rust Belt communities experienced a drop in attitude about the economy between 2008 and 2009, Detroit’s was noticeably smaller. By 2010, attitudes about the economy began to improve. This can also be seen by examining Figure 11, a density plot of values for Economy in each of the communities in the three years.

Community           2008   2009   2010
Detroit, MI         1.26   1.25   1.37
Gary, IN            1.50   1.28   1.37
Akron, OH           1.41   1.32   1.41
Fort Wayne, IN      1.50   1.36   1.51
Philadelphia, PA    1.60   1.42   1.49
Lexington, KY       1.69   1.53   1.64
State College, PA   1.65   1.59   1.72

Table 3: The average value for the Economy metric in the Rust Belt communities in each of the three years, sorted by the aggregated mean value for all three years. Notice the large drop in economic outlook for all cities in the Rust Belt from 2008 to 2009. Detroit, however, dropped only 0.01 in average economic outlook in that time span.

[Figure 11 panels: Detroit, MI; Gary, IN; Akron, OH; Fort Wayne, IN; Philadelphia, PA; Lexington, KY; State College, PA. x-axis: Economy; y-axis: Density of Responses]

Figure 11: Density of responses for Economy in Rust Belt communities over each of the three years. It appears that Detroit experienced a modest recovery in terms of economic outlook from 2009 to 2010 that the other communities in the Rust Belt did not experience.
What this table and plot suggest is that although it is widely believed that Detroit was hit especially hard by the economic collapse, there was a bit of resilience in the 2009 and 2010 time frame. Attitudes in Detroit were low regarding the economy in 2008, but did not exhibit much worsening in 2009, and began to improve in 2010. The automotive bailouts, which were passed and signed into law in 2009, may be a reasonable explanation for why this occurred in America’s “Motor City”.
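The year-over-year changes discussed above can be recomputed directly from the values printed in Table 3. The sketch below does only that arithmetic; the variable names are our own, and rounding to two decimals matches the precision of the table.

```python
# Economy means per community for (2008, 2009, 2010), copied from Table 3.
economy = {
    "Detroit, MI": (1.26, 1.25, 1.37),
    "Gary, IN": (1.50, 1.28, 1.37),
    "Akron, OH": (1.41, 1.32, 1.41),
    "Fort Wayne, IN": (1.50, 1.36, 1.51),
    "Philadelphia, PA": (1.60, 1.42, 1.49),
    "Lexington, KY": (1.69, 1.53, 1.64),
    "State College, PA": (1.65, 1.59, 1.72),
}

# (change 2008 -> 2009, change 2009 -> 2010) for every community.
changes = {
    community: (round(y2009 - y2008, 2), round(y2010 - y2009, 2))
    for community, (y2008, y2009, y2010) in economy.items()
}

# The community whose 2008 -> 2009 change was least negative.
smallest_drop = max(changes, key=lambda community: changes[community][0])
```

With these figures, every community shows a negative change from 2008 to 2009, and `smallest_drop` identifies Detroit, whose change over that span is -0.01.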
CommuniD3, with its use of linked plots and user interaction, allows for the discovery of features and trends in the data that are far more difficult to discover with static plots alone. By creating the tool prior to analyzing the data, we were able to find stories in the data that may not have been as readily apparent otherwise. Specifically, we uncovered an especially strong link between quality of education and Community Attachment in the Great Plains, the importance of Social Offerings in Myrtle Beach and Biloxi, and slowly recovering economic attitudes in the Rust Belt.

While the interactive tool is specialized for this particular dataset, the philosophy and ideas behind its creation hold for other data and applications. By empowering the user to guide his or her own discoveries, the analysis of data can be completed by subject-matter experts in their fields who may be less technologically inclined. The flexibility and ease of Shiny, combined with the interactivity of D3, will hopefully open up a whole new set of possibilities for analyzing complex datasets more easily.

References

Becker, Richard A. and Allan R. Wilks (2013). maps: Draw Geographical Maps. R package version 2.3-6. url: http://CRAN.R-project.org/package=maps.
Bostock, Michael, Vadim Ogievetsky, and Jeffrey Heer (2011). “D3: Data-Driven Documents”. In: IEEE Trans. Visualization & Comp. Graphics (Proc. InfoVis). url: http://vis.stanford.edu/files/2011-D3-InfoVis.pdf.
Buja, Andreas et al. (1991). “Interactive data visualization using focusing and linking”. In: Visualization, 1991. Visualization’91, Proceedings., IEEE Conference on. IEEE, pp. 156–163.
Couture-Beil, Alex (2013). rjson: JSON for R. R package version 0.2.13. url: http://CRAN.R-project.org/package=rjson.
Foundation, The Knight (2010). Scorecard. url: http://issuu.com/knightfoundation/docs/knight-communities-overall?viewMode=magazine&mode=embed (visited on 03/10/2014).
RStudio Inc. (2013). shiny: Web Application Framework for R. R package version 0.4.0. url: http://CRAN.R-project.org/package=shiny.
The Knight Foundation (2014). Soul of the Community. (visited on 03/03/2014).
The United States Census Bureau (2014). Census Regions and Divisions of the United States. (visited on 01/22/2015).
Tukey, John W. (1977). Exploratory Data Analysis. Addison-Wesley.
Wickham, Hadley (2007). “Reshaping Data with the reshape Package”. In: Journal of Statistical Software.
Wickham, Hadley (2009). ggplot2: elegant graphics for data analysis. Springer New York. isbn: 978-0-387-98140-6. url: http://had.co.nz/ggplot2/book.
Wickham, Hadley (2011). “The Split-Apply-Combine Strategy for Data Analysis”. In: Journal of Statistical Software.