Planet Four: Probing Springtime Winds on Mars by Mapping the Southern Polar CO2 Jet Deposits
K.-Michael Aye, Megan E. Schwamb, Ganna Portyankina, Candice J. Hansen, Adam McMaster, Grant R. M. Miller, Brian Carstensen, Christopher Snyder, Michael Parrish, Stuart Lynn, Chuhong Mai, David Miller, Robert J. Simpson, Arfon M. Smith
K.-Michael Aye a,∗, Megan E. Schwamb b,c,d,e, Ganna Portyankina a, Candice J. Hansen f, Adam McMaster h, Grant R.M. Miller h, Brian Carstensen i, Christopher Snyder i, Michael Parrish i, Stuart Lynn i, Chuhong Mai c,g, David Miller i, Robert J. Simpson h, Arfon M. Smith i,j
a Laboratory for Atmospheric and Space Physics, University of Colorado at Boulder, Boulder, CO 80303, USA
b Gemini Observatory, Northern Operations Center, 670 North A’ohoku Place, Hilo, HI 96720, USA
c Institute for Astronomy and Astrophysics, Academia Sinica; 11F AS/NTU, National Taiwan University, 1 Roosevelt Rd., Sec. 4, Taipei 10617, Taiwan
d Yale Center for Astronomy and Astrophysics, Yale University, P.O. Box 208121, New Haven, CT 06520, USA
e Department of Physics, Yale University, New Haven, CT 06511, USA
f Planetary Science Institute, 1700 E. Fort Lowell, Suite 106, Tucson, AZ 85719, USA
g School of Earth and Space Exploration, Arizona State University, Tempe, AZ 85287, USA
h Oxford Astrophysics, Denys Wilkinson Building, Keble Road, Oxford OX1 3RH, UK
i Adler Planetarium, 1300 S. Lake Shore Drive, Chicago, IL 60605, USA
j Space Telescope Science Institute, 3700 San Martin Drive, Baltimore, MD 21218, USA
Abstract
The springtime sublimation process of Mars’ southern seasonal polar CO2 ice cap features dark fan-shaped deposits appearing on the top of the thawing ice sheet. The fan material likely originates from the surface below the ice sheet, brought up via CO2 jets breaking through the seasonal ice cap. Once the dust and dirt is released into the atmosphere, the material may be blown by the surface winds into the dark streaks visible from orbit. The location, size and direction of these fans record a number of parameters important to quantifying seasonal winds and sublimation activity, the most important agent of geological change extant on Mars. We present results of a systematic mapping of these south polar seasonal fans with the Planet Four online citizen science project. Planet Four enlists the general public to map the shapes, directions, and sizes of the seasonal fans visible in orbital images. Over 80,000 volunteers have contributed to the Planet Four project, reviewing 221 images, from Mars Reconnaissance Orbiter’s HiRISE (High Resolution Imaging Science Experiment) camera, taken in southern spring during Mars Years 29 and 30. We provide an overview of Planet Four and detail the processes of combining multiple volunteer assessments together to generate a high fidelity catalog of ∼
Keywords: Mars, atmosphere; Mars, polar caps; Mars, surface; Mars, polar geology
∗ Corresponding author
Email address: [email protected] (K.-Michael Aye)
Preprint submitted to Elsevier Journal October 25, 2018. arXiv [astro-ph.EP]
1. Introduction
Mars has a predominantly CO2 atmosphere with pressure levels buffered by seasonal CO2 polar caps [Leighton and Murray, 1966]. In the winter atmospheric CO2 falls as snow or condenses directly onto the surface, forming a seasonal ice layer with a thickness of up to 1 m, depending on the latitude. In the spring the south polar region of Mars exhibits a host of exotic phenomena associated with sublimation of the seasonal CO2 polar cap, and sublimation winds [Smith et al., 2001] contribute to atmospheric circulation.
In the south polar region images from the Mars Reconnaissance Orbiter (MRO) High Resolution Imaging Science Experiment (HiRISE, McEwen et al. [2007]) document activity best described by the “Kieffer” model [Hansen et al., 2010; Kieffer, 2007; Piqueux et al., 2003a]:
1. Over the winter CO2 anneals to form a translucent slab of impermeable ice. Penetration of sunlight through the CO2 ice, which warms the ground below, results in basal sublimation of the ice.
2. The laboratory measurements done by Hansen [2005] show that up to 70 % of the solar energy that reaches the top surface of a 1 m thick slab layer can be transmitted through it. Recent laboratory experiments by Kaufmann and Hagermann [2016] were able to trigger dust eruptions from a layer of dust inside a CO2 ice slab under Martian conditions, lending further credence to the proposed CO2 jet and fan production model.
3. Trapped gas escapes through ruptures in the ice, eroding and entraining material from the surface below [de Villiers et al., 2012].
4. When this dust-laden gas is expelled into the atmosphere the dust settles in fan-shaped deposits on the top of the ice in directions oriented by the ambient wind, as shown in Figure 1 [Thomas et al., 2010, 2011].
5. When the layer of seasonal ice sublimates in summer, the fans fade, as the material mostly blends back into the surface [Hansen et al., 2010].
6. The compressed CO2 gas streams of the jets are believed to erode the surface, carving uniquely Martian spidery channels originally identified in images from the Mars Orbiter Camera [Piqueux et al., 2003b], now referred to as araneiforms [Hansen et al., 2010].
The number, time history, area covered and changes in direction of the fans provide a wealth of information on the spring sublimation process and spring winds. Apart from a few wind direction estimations from remotely observed dunes [Ewing et al., 2010] and surface rover wind measurements [Greeley et al., 2006; Newman et al., 2017], no widespread wind measurements exist for Mars. The science goals enabled by cataloging fan measurements fall into two categories:
1. Enhance our understanding of spring winds and provide constraints for global and mesoscale circulation models. The length, width, and direction of these fans are snapshots in time of the local wind direction. Changes in the orientation of the fans over time record changes in wind direction. These markers can be compared to predictions from global and mesoscale circulation models (e.g. Smith et al. [2015]) to improve our understanding of Mars’ weather in the polar regions. Dust injected into the atmosphere can be estimated.
2. Extend our understanding of the sublimation process and its efficacy as an agent of change on the Martian surface. The number of fans as a function of time records sublimation activity while the overlying ice thickness and insolation change during the season. The areal coverage of the fans allows us (with reasonable assumptions about particle size) to estimate the amount of material eroded from the surface on seasonal timescales. Inter-annual variability and the relationship of timing of seasonal activity to global dust storms can be quantified with this data-set (these are topics of future papers).
Although the value of this data-set is clear, the sheer number of fans (on the order of hundreds of thousands) present in HiRISE images from multiple locations and times observed over many Mars years has proven daunting to catalog. Attempts at developing automated detection algorithms have been unsuccessful at identifying the locations and shapes of these seasonal fans in images from orbit in a reliable fashion [Aye et al., 2010]. However, there is increasing interest in using the outcomes of Citizen Science projects as training data for neural networks (e.g. Alger et al. [2018]; Banerji et al. [2010]; Bird et al. [2018]; Bowley et al. [2018]; Nguyen et al. [2018]; Peng et al. [2018]), hence we believe that these two lines of research will become strongly complementary in the near future.
The task of mapping the dark fans is simply pattern recognition, and the human brain is ideally suited for this task, easily capable of spotting and outlining these features. With the advent of the Internet, tens of thousands of people across the globe can be enlisted to assist scientists with tasks that are impossible to automate. This citizen science or crowd-sourcing approach, where independent assessments from multiple non-expert classifiers are combined, has become an established technique as the data volumes have continued to grow. This method has been applied to nearly all areas in astronomy and planetary science [Marshall et al., 2014] (see references therein) including galaxy morphology [Lintott et al., 2008; Willett et al., 2013], identification of planet transits [Fischer et al., 2012; Schwamb et al., 2012], crater counting [Bugiolacchi et al., 2016; Robbins et al., 2014] and a sister project of the effort presented here, Planet Four: Terrains [Schwamb et al., 2017b].
In collaboration with the Zooniverse [Fortson et al., 2012; Lintott et al., 2011], the largest collection of online citizen science projects, we have developed Planet Four, a web portal to enlist the general public to identify and map the seasonal fans in HiRISE images of Mars’ polar regions.
In this paper we present the first results from the Planet Four project, a catalog of seasonal fans from two Mars years, MY 29 and 30, of HiRISE monitoring of the Martian South Polar region. In Section 2, we provide an overview of the HiRISE South Pole Seasonal Processes Monitoring Campaign and the specific HiRISE observations used in this study. In Section 3, we present the Planet Four project and the online classification interface. Section 4 details the process for assessing and combining the volunteer classifications to create a catalog of seasonal features. In
Figure 1: Subsection of HiRISE image ESP_011960_0925, taken at (LAT, LON) −87.303°, 167.970°; L_s
2. HiRISE Instrument and Seasonal Processes Monitoring Campaign
The Mars Reconnaissance Orbiter (MRO) has the ability to turn off nadir to target a specific location. In its inclined orbit there are numerous opportunities to achieve repeat coverage in the polar region. In order to study seasonal processes the HiRISE team selected a limited number of regions of interest (ROIs) in the Martian south polar region to image throughout the spring season. Time is defined on Mars by the orbital longitude L_s, where southern spring begins at L_s = 180°.
Originally, the HiRISE monitoring campaigns were numbered according to the ordinal number of seasons the MRO mission had been observing Mars. This work focuses on the observations from seasons 2 and 3, which have more regular repeat HiRISE imaging of ROIs over multiple years compared to the season 1 HiRISE monitoring campaign. To be able to compare with other missions and modeling, we also identify our data using the convention of Martian years, established by Clancy et al. [2000] and Piqueux et al. [2015], where Mars Years 29 and 30, also written as MY29 and MY30, correspond to HiRISE seasons 2 and 3. Every day, citizen scientists are making more fan measurements for later Mars years and the catalog continues to grow. The longer timespan covered by the catalog will be discussed in future paper(s).
Figures 4 and 5 provide an overview of the observed locations and times in solar longitudes of the HiRISE data used in this work. Table 1 lists the ROIs selected for analysis using Planet Four. 221 high quality images from southern spring season 2 and 3 (i.e. MY 29 and 30) were selected for analysis on Planet Four (see Table 2). The reduced HiRISE products were obtained from the National Aeronautics and Space Administration’s (NASA) Planetary Data System (PDS) HiRISE PDS Data Node (http://hirise-pds.lpl.arizona.edu/PDS/).
HiRISE is a pushbroom imager. It has ten 2048-pixel detectors in the cross-track direction, which covers ∼ ∼ −1. A typical size image has ∼ × 18) km area. Color is available in the center 20 % of the image. A full description of the camera is found in McEwen et al. [2007].
It is generally easier to identify the fans in the color portion of the image, so only the ∼ × 648 pixel sub-images that we will refer to as “tiles”. To avoid edge effects, the tiles are generated such that there is a 100-pixel overlap with the neighboring tiles. We avoid showing volunteers tiles where part or most of the tile is blank. Due to the variable length and width of HiRISE images, there is typically a small region on the right and bottom edges of the non-map projected HiRISE image that cannot be made into a full-sized tile and thus is not searched for seasonal features with Planet Four. Pixel sampling scales per tile are typically 24.7 cm/pixel when HiRISE is in 1 × × × 4. For the seasons 2 and 3 monitoring campaign, a HiRISE image is associated with 36 to 635 tiles (see Table 2). For the analysis presented here 23,723 tiles derived from 129 full frame HiRISE season 2 monitoring images and 19,181 tiles derived from 92 season 3 HiRISE images were reviewed by Planet Four volunteers. A characteristic sample of Planet Four tiles is presented in Figures 2 and 3.

Latitude  Longitude  Informal Name           Images (season 2)  Images (season 3)
-73.53    339.5      Binghamton              2                  0
-74.22    168.5      Caterpillar             1                  0
-81.38    295.8      Inca City               7                  7
-81.46    296.3      Inca City Ridges        7                  8
-81.68    66.3       Potsdam                 7                  9
-81.80    76.1       Starburst               7                  3
-81.93    60.4       Albany                  5                  0
-81.9     4.8        Buenos Aires            7                  7
-82.2     225.2      Wellington              2                  0
-82.3     306        Taichung                1                  0
-82.5     80.0       Buffalo                 2                  0
-82.69    273.1      Cortland                1                  0
-83.2     158.4      Rochester               4                  0
-84.82    65.7       Giza                    11                 7
-85.0     95.0       Schenectady             1                  0
-85.02    259.0      Troy                    1                  0
-85.13    180.7      Ithaca                  10                 6
-85.18    92.0       Geneseo                 0                  1
-85.4     103.9      Macclesfield            7                  7
-86.25    99.0       Manhattan Cracks        1                  5
-86.39    99.0       Manhattan Classic       8                  9
-86.8     178.0      Písaq                   3                  1
-86.98    169.7      Atka                    3                  0
-86.99    99.1       Manhattan Frontinella   5                  3
-87.0     72.3       Halifax                 3                  0
-87.0     86.4       Oswego edge             6                  10
-87.0     127.3      Bilbao                  7                  3
-87.3     167.8      Portsmouth              5                  6

Table 1:
Regions of interest studied with Planet Four that were monitored during both seasons 2 (Mars Year 29) and 3 (Mars Year 30) HiRISE Southern Seasonal Processes Campaign. A full list of the images is available as supplemental data in the file P4_catalog_v1.0_metadata.csv. The Latitude and Longitude values are the mean value over the center latitudes and longitudes of the respective HiRISE observations. All informal names are internal designations used by the Planet Four team and not approved by the International Astronomical Union.
Figure 2: Randomly selected sample of Planet Four tiles characteristic of the season 2 and season 3 HiRISE monitoring campaign. Each tile has 840 × 648 pixels, but its ground resolution varies with HiRISE binning modes. This is reflected in the map_scale column of the Planet Four catalog files.
Figure 3: Randomly selected sample of Planet Four tiles characteristic of the season 2 and season 3 HiRISE monitoring campaign. Each tile has 840 × 648 pixels, but its ground resolution varies with HiRISE binning modes. This is reflected in the map_scale column of the Planet Four catalog files.
Figure 4: Map overview of the regions of interest for the seasonal monitoring campaign of HiRISE. For readability, the following regions are shown as cyan-colored unlabeled dots: Inca City Ridges, Schenectady, Troy, Manhattan Cracks, Manhattan Classic, Atka, Halifax, Oswego edge.
Figure 5: Temporal and latitude coverage for the season 2 and season 3 HiRISE monitoring campaign observations reviewed on Planet Four.

Observation ID    Latitude  Longitude  L_s    Start Time   North Azimuth  Tiles
ESP_011296_0975   -82.197   225.253    178.8  2008-12-23   110.6          91
ESP_011341_0980   -81.797   76.13      180.8  2008-12-27   110.2          126
ESP_011348_0950   -85.043   259.094    181.1  2008-12-27   123.6          91
ESP_011350_0945   -85.216   181.415    181.2  2008-12-27   99.7           126
ESP_011351_0945   -85.216   181.548    181.2  2008-12-27   128.0          91
ESP_011370_0980   -81.925   4.813      182.1  2008-12-29   110.6          126
ESP_011394_0935   -86.392   99.068     183.1  2008-12-31   139.4          72
ESP_011403_0945   -85.239   181.038    183.5  2009-01-01   106.5          164
ESP_011404_0945   -85.236   181.105    183.6  2009-01-01   134.1          91
ESP_011406_0945   -85.409   103.924    183.7  2009-01-01   111.3          126
ESP_011407_0945   -85.407   103.983    183.7  2009-01-01   138.8          91
ESP_011408_0930   -87.019   86.559     183.8  2009-01-01   148.9          59
ESP_011413_0970   -82.699   273.129    184.0  2009-01-01   112.8          108
ESP_011420_0930   -87.009   127.317    184.3  2009-01-02   157.3          54
ESP_011422_0930   -87.041   72.356     184.4  2009-01-02   157.0          54
ESP_011431_0930   -86.842   178.244    184.8  2009-01-03   148.6          54
ESP_011447_0950   -84.805   65.713     185.5  2009-01-04   113.0          218
ESP_011448_0950   -84.806   65.772     185.6  2009-01-04   138.8          59

Table 2: Partial table of the HiRISE observations used, to indicate spatial and temporal coverage. The full table is published in the online version. Coordinates are the center coordinates of the HiRISE pointings used in this study. Latitudes are planetocentric and the given north azimuth angle is for the non-map-projected data that went into the Planet Four system.
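The tiling scheme described in this section (840 × 648 pixel tiles, a 100-pixel overlap between neighbors, and no tile for the leftover strip at the right and bottom edges) can be sketched as follows. This is only an illustrative sketch and not the Planet Four production code: the function name `tile_origins` and the choice to start tiling at the upper-left pixel are assumptions, and the check that rejects blank tiles is omitted.

```python
def tile_origins(width, height, tile_w=840, tile_h=648, overlap=100):
    """Return the upper-left (x, y) pixel offsets of all full-sized tiles.

    Neighboring tiles share `overlap` pixels. A strip at the right or
    bottom edge that is too small for a full tile is skipped.
    Assumption: tiling starts at pixel (0, 0) of the image.
    """
    step_x = tile_w - overlap   # horizontal stride between tile origins
    step_y = tile_h - overlap   # vertical stride between tile origins
    xs = range(0, width - tile_w + 1, step_x)
    ys = range(0, height - tile_h + 1, step_y)
    return [(x, y) for y in ys for x in xs]


# Example with a hypothetical 2000 x 1500 pixel image.
origins = tile_origins(2000, 1500)
print(len(origins), origins)
```

With these assumptions the stride is 740 pixels horizontally and 548 pixels vertically, so the number of tiles per image depends only on the image dimensions.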
3. Planet Four
Here we describe the Planet Four classification interface and the information generated by volunteers visiting the Planet Four website.
Planet Four volunteers are asked to identify and outline fans in the presented tiles. Sometimes a deposit has an indeterminate direction, in which case we call it a “blotch”. Although less useful for wind regime studies, the blotches are sites where the ice has ruptured and released material, so they are important to studying the sublimation process of the polar CO2 ice sheet. Thus, volunteers are asked to identify and mark blotches as well. Positions, orientations, and sizes of fans and blotches are obtained via a web interface (see Figure 6) built upon the Zooniverse’s Application Programming Interface (API), which communicates with their custom built Ouroboros web platform (described in Appendix A). Each tile is assessed by approximately 30–100 independent reviewers. To ensure reviewers have no prior information that may influence their judgment, tiles are randomly served to the classifier, and no identifying information about the parent HiRISE image is presented in the Planet Four web interface. The volunteer is blind to the location on the South Pole, the time of season the observation was taken, and the responses from other classifiers while reviewing a given tile. Planet Four was launched originally in English; later on the websites, classification interface, and help material were also translated into several languages, including traditional and simplified character Chinese, German, and Magyar (Hungarian). For the analyses presented here, all Planet Four classifications are treated the same, regardless of what language the volunteer was using in the classification web interface.
First time visitors to the Planet Four website are presented with a short inline interactive tutorial that explains the task and guides the classifier on how to use the marking tools. Additional training material is also available elsewhere on the site. The tutorial is shown only once for those classifiers using the Planet Four web interface logged-in with a registered Zooniverse account. Volunteers using the site in the non-logged-in mode are presented with the tutorial each time they visit the Planet Four website. Other than the frequency of the tutorial appearing, the user experience on Planet Four, including the tutorial content, is exactly the same for logged-in and non-logged-in volunteers.
Fans and blotches are drawn by selecting the appropriate tool in the classification interface (see Figure 6), clicking on the tile displayed, and dragging to resize the marker to the appropriate shape and orientation. The fan tool generates a triangle with a rounded base with the user controlling the endpoint of the fan. The default opening angle for the fan marker is set to 5°. The blotch tool simply produces an ellipse with the user controlling the size and orientation of the major axis. For blotches, the default length of the minor axis is 0.75 times the pixel length of the major axis drawn. Once a blotch or fan marking has been made, a classifier can edit the initial parameters by manipulating handles on the marker. For blotches, the length of the major and minor axes and the rotation can be adjusted. For fans, the opening angle, orientation, and length can be modified. If only a single mouse click is made on the interface, then the minimum sized fan or blotch marker is produced: a fan with a length of 10 pixels and an opening angle of 1°, or an ellipse with both axes equal to 10 pixels. Additionally, there is an ‘Interesting Feature’ tool available for volunteers to highlight the position of anything that they deem worth review by the Planet Four Science Team. The Interesting Feature marker is not resizable. All markers drawn in the web interface can be repositioned or removed by the classifier.
Figure 6: The fan (above) and blotch (below) marker on the Planet Four tutorial image. Black circles and diamonds are the marker handles that can be used to adjust the shape and orientation in the web classification interface. The “x” is used to delete the marker.
3.2. Classification Database
Once the volunteer is done making markings, if any, and hits the ‘Finished’ button, the classification (which we define as the sum total of all the markings or lack of markings made by the volunteer) is submitted to the Ouroboros API to be saved to a database. At this point, the classifier can move on to view the next tile by hitting the ‘Next’ button or can choose instead to enter the Planet Four discussion tool (discussed in further detail in Section 3.3). Once the classification has been submitted, it cannot be revised. For blotches, the center position, rotation angle, and pixel lengths of the major and minor axes of the ellipse are recorded. For fans, the starting position, distance in pixels from the starting point to the end of the fan, opening angle, and rotation angle are saved to the database. For interesting features, only the pixel location is stored. If no features are marked, the database records the classification as a non-marking. A tile identifier and timestamp for each classification are also stored in the database.
If the volunteer is logged in with a registered Zooniverse account, the classifications are tracked in the database via the associated username. For non-logged-in classifications, a unique session id is generated and used to link the classifications completed by a given IP address and web browser. The non-logged-in identifier does not exactly correspond one-to-one to a unique individual. If a person classifies while not logged in and changes their IP address, their new classifications would be stored under a different identifier.
Additionally, if a volunteer initially participates as a non-logged-in classifier on Planet Four and then registers for a Zooniverse account, the previous classifications stored in the database are not linked to the Zooniverse username and remain associated with the unique non-logged-in session identifier.
We note there are occasional spurious or duplicate entries stored in the classification database, typically due to a glitch in the classifiers’ browser or a minor bug in the Ouroboros framework. These entries compose a very small percentage of the total volunteer classifications. They are easily identified and removed from the analysis presented here. Further details are provided in Appendix B. Additionally, the Planet Four classification interface originally recorded a different angle than the intended spread angle from the fan marking tool. This was identified and subsequently fixed in the software. The true spread angle of the fan marker drawn by the volunteers is recoverable from the values recorded in the database, and we have adjusted the affected classifications.
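To make the stored marking parameters and the interface defaults concrete, the following minimal sketch models a fan and a blotch record. The class and field names are hypothetical and do not reflect the actual database schema; only the numerical defaults (5° default opening angle, minor axis equal to 0.75 times the major axis, 10-pixel minimum markers with a 1° opening angle) are taken from the description above. The rotation angle of 0° assigned to single-click markers is an assumption.

```python
from dataclasses import dataclass


@dataclass
class FanMarking:
    x: float             # starting (base) point, pixels
    y: float
    distance: float      # base point to end of fan, pixels
    angle: float         # rotation angle, degrees
    spread: float = 5.0  # opening angle, degrees (interface default)


@dataclass
class BlotchMarking:
    x: float             # center of the ellipse, pixels
    y: float
    major_axis: float    # pixels
    minor_axis: float    # pixels
    angle: float         # rotation angle, degrees


def default_blotch(x, y, major_axis, angle):
    """Blotch as first drawn: minor axis is 0.75 times the major axis."""
    return BlotchMarking(x, y, major_axis, 0.75 * major_axis, angle)


def single_click_fan(x, y):
    """Minimum-sized fan produced by a single click (angle 0 is assumed)."""
    return FanMarking(x, y, distance=10.0, angle=0.0, spread=1.0)


def single_click_blotch(x, y):
    """Minimum-sized blotch produced by a single click (angle 0 is assumed)."""
    return BlotchMarking(x, y, 10.0, 10.0, 0.0)


print(default_blotch(120, 80, major_axis=40, angle=30))
```

Very small markers with exactly these minimum values are likely accidental clicks rather than deliberate outlines, which is one reason a record layout like this is convenient for later filtering.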
Associated with the Planet Four classification interface is a dedicated object-orientated discussion tool known as “Talk” (http://talk.planetfour.org). Each Planet Four tile assessed on the main classification interface has a dedicated page on the Planet Four Talk website. Volunteers can access these pages directly through the classification interface after submitting their classification. With Talk, volunteers can write comments, add searchable Twitter-like hash tags, create longer side discussions, and group similar tiles together in collections. For the analysis presented here, we focus strictly on the volunteer markings from the main user interface, and do not include a complete analysis of the data from the Talk tool.
Figure 7: Distribution of the number of Planet Four classifications for Season 2 (MY29) and Season 3 (MY30) tiles with a bin size of 5. The distribution peaks at the two different retirement values of 100 and 30. Due to performance issues in the webserver’s queueing system, the retirement values were at times not enforced, leading to the spread-out distributions at values higher than the retirement values.
Planet Four was publicly launched on 2013 Jan 8 as part of the British Broadcasting Corporation’s (BBC) Stargazing Live, three nights of live astronomy programming (2013 Jan 8–10) on BBC Two in the United Kingdom. Review of Season 2 and 3 tiles spans from January 2013 to March 2015 with 9,809,637 classifications produced in total. The majority of classifications for Seasons 2 and 3 were obtained during the BBC Stargazing period, but subsequently data from HiRISE’s other seasonal monitoring campaigns were mixed with the Season 2 and Season 3 classifications. The results from data outside season 2 and 3, which are still in the process of being reviewed on the Planet Four website, will be the topic of subsequent publications. Figure 7 plots the distribution of classifications per tile for Seasons 2 and 3. Due to the high classification rate at launch, tiles were set to retire from rotation in the web interface after 100 independent assessments (counting duplicates) to ensure that the project would continue to serve data over the Stargazing period. Over time the classification rate dropped significantly from launch, and on 2013 Dec 9 the retirement threshold for a tile was lowered to a more reasonable, and statistically acceptable, value of 30 to better accommodate the actual work rate on Planet Four. This value is similar to the image retirement threshold that was used by the Zooniverse’s Milky Way Project [Simpson et al., 2012], which enlists the general public in a similar task, drawing circles on space-based infrared images to identify the shape and size of star formation bubbles.
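The retirement rule described above can be written as a small function. This is an illustrative sketch only: the names are hypothetical, the real rule was enforced by the web platform's queueing system (and, as noted in the caption of Figure 7, not always strictly), and the sketch assumes the lower threshold applied from 2013 Dec 9 onward.

```python
from datetime import date

# Date on which the retirement threshold was lowered from 100 to 30.
THRESHOLD_CHANGE = date(2013, 12, 9)


def retirement_limit(on_date):
    """Number of classifications after which a tile retires on `on_date`."""
    return 100 if on_date < THRESHOLD_CHANGE else 30


def is_retired(n_classifications, on_date):
    """True if a tile with this many classifications leaves rotation."""
    return n_classifications >= retirement_limit(on_date)


print(is_retired(45, date(2013, 1, 9)))   # during the launch period
print(is_retired(45, date(2014, 6, 1)))   # after the threshold was lowered
```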
Figure 8: Distribution of volunteer classifications. Panel a shows the combined distribution tallied together for both logged-in and non-logged-in sessions. Panel b shows the volunteer classification count individually for registered and non-logged-in volunteers. Both histograms use a bin size of 2.
4. Data reduction
In order to create fan and blotch object catalogs from the Planet Four markings, a reduction pipeline was implemented, for which the code is open source and made available at https://github.com/michaelaye/planet4. The pipeline is based on the Python programming language, interfacing also to the US Geological Survey’s (USGS) Integrated Software for Imagers and Spectrometers (ISIS) [Anderson et al., 2004; Becker et al., 2007], and making use of the “scikit-learn” package for machine-learning related tasks [Pedregosa et al., 2011]. This data reduction pipeline has five main conceptual stages (see Fig. 9):
1. Cleanup, where the Planet Four classification data is cleaned, normalized and converted to a binary database (Section 4.1),
2. Clustering, where the markings of the many different volunteers are combined into, ideally, one resulting average object (Section 4.2),
3. Combination, where we combine fan and blotch markings that seem to address the same visible object in the image into a meta-object for further processing during the next stage (Section 4.3),
4. Thresholding, where a cut on the required number of volunteers that voted for either fan or blotch decides if the previously created meta-object should be considered a fan or a blotch (Section 4.3.1), and finally
5. Ground Projection, where we project the HiRISE image pixel coordinates of the resulting fan and blotch markings into latitude and longitude coordinates on Mars (Section 4.4).
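The thresholding stage can be pictured as a vote between the fan and blotch markings that were combined into one meta-object. The sketch below is only an assumption about how such a cut could look: the function name, the use of a simple vote fraction as the marking weight, and the cut value of 0.5 are all hypothetical, and the cut actually used by the pipeline is the one defined in Section 4.3.1.

```python
def classify_meta_object(n_fan_votes, n_blotch_votes, fan_cut=0.5):
    """Decide whether a combined fan/blotch meta-object is a fan or a blotch.

    The fan weight is the fraction of contributing markings that were fans.
    `fan_cut` is a hypothetical threshold, not the value used in the paper.
    """
    total = n_fan_votes + n_blotch_votes
    if total == 0:
        raise ValueError("a meta-object needs at least one marking")
    fan_weight = n_fan_votes / total
    return "fan" if fan_weight >= fan_cut else "blotch"


print(classify_meta_object(8, 2))
print(classify_meta_object(2, 8))
```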
[Figure 9 flowchart. Node labels: Database; Cleanup; Clustering (per fans/blotches); Fan & Blotch overlapping? (yes/no); Create meta-object with marking weights; Thresholding (decides between fan and blotch); Ground Projection; Final fan/blotch catalog]
Figure 9: Overview of conceptual steps of the Planet Four data reduction pipeline.
4.1. Database Cleanup
After the removal of the tutorial data (see 3.1.1), and a first cleaning for spurious, incomplete and duplicate classification database entries (see Appendix B), we normalize all angles from the Planet Four classification interface, and finally produce a binary database in the format of HDF5 (Hierarchical Data Format, version 5) for the remainder of the data processing. Normalizing the angles is required because the Planet Four system records blotches with an angular range from -180° to 180° while ellipses possess a degree-2 rotational symmetry. This means only the range of 0° to 180° is required to fully describe blotches, once the radii are sorted in a consistent way (semi-major axis first). Volunteers randomly start to draw the ellipses required to mark blotches either from the semi-minor axis or the semi-major axis, making it error-prone to cluster on these parameters without normalization. The cleaned raw Planet Four classifications as used by this work’s analyses are provided as supplemental data to this work in the file P4_catalog_v1.0_raw_classifications.csv. Further details about the format of the raw classifications are described in Appendix C.
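The blotch normalization described above can be sketched as follows. This is a minimal illustration, not the pipeline code: it assumes that the recorded angle refers to the direction of the first radius, so that swapping the two radii corresponds to rotating the ellipse by 90°; the function name is hypothetical.

```python
def normalize_blotch(radius_1, radius_2, angle):
    """Return (semi-major, semi-minor, angle) with the angle in [0, 180).

    Assumption: `angle` (degrees) is measured along `radius_1`, so swapping
    the radii requires a 90 degree rotation. The two-fold symmetry of an
    ellipse then allows the angle to be reduced modulo 180 degrees.
    """
    if radius_1 < radius_2:
        radius_1, radius_2 = radius_2, radius_1
        angle += 90.0
    return radius_1, radius_2, angle % 180.0


# The same ellipse drawn starting from either axis ends up with one description.
print(normalize_blotch(20, 10, 100))
print(normalize_blotch(10, 20, -170))
```

After this step, two volunteers who outlined the same blotch but started from different axes produce comparable radii and angles, which is what the later clustering on these parameters relies on.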
We identify fans and blotches by combining together the multiple volunteer assessments from each Planet Four tile. To identify and precisely locate the marked features from the multiple classifications performed by many (between 30 and 100, see Appendix A) volunteers per Planet Four tile, we perform a clustering analysis on the data. Figure 10 shows an example of fan markings for a Planet Four tile. After having evaluated several different clustering algorithms, we have identified the Density-based Spatial Clustering of Applications with Noise (DBSCAN) clustering algorithm of Ester et al. [1996] as the most appropriate one for our application. DBSCAN has the advantage of not requiring the number of expected clusters as input; instead, it is controlled by two input parameters describing the minimum number of members of a cluster (min_samples) and the maximum distance for a data point to be included into a cluster (epsilon). (Details on how we determine these parameters are described in Section 4.2.1.) We set up our clustering pipeline using the DBSCAN implementation in the scikit-learn Python library [Pedregosa et al., 2011]. All volunteer responses are treated the same with equal weight in the clustering algorithm. Due to the differences in the classification interface for marking fans and ellipse-shaped blotches (fans are drawn from a base point, whereas blotches are drawn from the center), the fan and blotch markings are clustered separately at this stage, and require their own set of clustering parameters.
In a first stage, we cluster the data for each Planet Four tile on the (x, y) pixel coordinates of the base point of fans and of the center for blotches (see Fig. 12 for a visual description of the available coordinates of the markings). Figure 13 shows the result of clustering in two dimensions of the x and y base coordinates of the fan markings, using a multi-step approach as shown in Fig. 11, as described below.
Once the clusters for a given set of parameters (see Section 4.2.1 for details on the parameter tuning) have been defined, the original marking data of each cluster’s members are averaged to create one average marking object per cluster, including average directions for fan objects, e.g. in Fig. 13. The number of markings that went into the creation of the averaged object is stored for later.
After having clustered both fans and blotches on their base and center coordinates respectively, we apply a second stage of clustering on the markings. For fan deposits, the major objective of this
Figure 10: Fan markings for Planet Four tile APF00001cl of HiRISE image ESP_012322_0985. Left: The cut-out tile that is shown to the Planet Four volunteers. Right: 51 different users have classified this image. The colors cycle through randomly for the markings of different users. With such a large number of different volunteers classifying, the “sensitivity” for detection is increased, as notable by a few markings that outline even the smallest potential dark deposit candidates. However, when the “crowd” does not agree with these, i.e. if the potential cluster does not reach the min_samples number of required members, the clustering pipeline discards these entries, as shown in Fig. 13.
[Figure 11 diagram. Panel titles: Fan clustering; Blotch clustering. Step labels: base coords within 10 px; center coords within 10 px; rad_1 & rad_2 within 30 px; center coords within 25 px; rad_1 & rad_2 within 50 px; angle within 20°]
Figure 11:
The sequence of clustering steps for both fan and blotch markings. It became apparent during our studies that fan markings show less scatter, probably because the tool has to be placed at a clearly identifiable base point. Blotches, however, do not show a clearly identifiable center, and their outline is often less sharply defined, creating a wider distribution of marking results, especially for larger blotches. This required a second run of clustering with more relaxed cluster parameters, as described in Section 4.2.1 and in Table 3.

[Figure 12 diagram. Fan: base point (pixels), distance (pixels), spread (degrees), angle from horizontal (degrees). Blotch: center point (pixels), Radius_1 (pixels), Radius_2 (pixels), angle from horizontal (degrees). Pixel positions x, y are measured from (0,0).]
Figure 12:
The different coordinates available in the Planet Four marking catalog are described here. Fans possess (x, y) base coordinates, an angle from horizontal for their pointing, and a spread angle. Blotches possess center (x, y) coordinates, semi-major and semi-minor axis radii, and also an angle indicating their alignment towards the horizontal.
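The first clustering stage on fan base points can be illustrated with a short sketch. It assumes the scikit-learn `DBSCAN` class (whose distance parameter is called `eps`), the 10 px epsilon and the 0.13 min_samples_factor quoted in Section 4.2.1 and Table 3; the function name, the synthetic coordinates and the `max(1, ...)` guard are hypothetical and are not taken from the Planet Four pipeline.

```python
import numpy as np
from sklearn.cluster import DBSCAN

MSF = 0.13     # min_samples_factor quoted in Section 4.2.1
EPS_XY = 10.0  # px, first-stage epsilon for fan base points (Table 3)


def cluster_base_points(xy, n_markings, eps=EPS_XY, msf=MSF):
    """Cluster the fan base points of one tile and average each cluster.

    xy         : sequence of (x, y) base coordinates in pixels
    n_markings : number of classifiers who drew markings on this tile
    Returns (labels, centers); label -1 marks noise, which is dropped.
    """
    xy = np.asarray(xy, dtype=float)
    # max(1, ...) is a guard added for this sketch, not taken from the paper.
    min_samples = max(1, round(msf * n_markings))
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit(xy).labels_
    centers = {
        int(lab): xy[labels == lab].mean(axis=0)
        for lab in sorted(set(labels.tolist()))
        if lab != -1
    }
    return labels, centers


# Synthetic markings: two tight groups plus two stray clicks.
demo_xy = [
    (100, 100), (102, 101), (98, 99), (101, 97), (99, 103), (103, 100),
    (300, 220), (302, 222), (298, 219), (301, 218), (299, 223),
    (500, 50), (20, 400),
]
demo_labels, demo_centers = cluster_base_points(demo_xy, n_markings=30)
```

With 30 classifiers the sketch requires 4 members per cluster, so the two tight groups survive and are averaged, while the two stray clicks are labelled as noise.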
Figure 13:
Fans from Figure 10 for Planet Four subject ID APF00001cl after applying our clustering pipeline. Left: For direct comparison, this shows the same as the right panel of Fig. 10. Right: Results after clustering, identification of noise markings, and averaging the cluster members’ data into one object per cluster. Markings that do not become members of a cluster are defined as noise and are discarded from further processing (shown as white dots).

Figure 14: Planet Four tile
APF0000de3 from HiRISE image
ESP_011961_0935. It shows the prevalence and precise identification of CO2 jet deposits with multiple directions that start from the same base point, indicating multiple eruptions under different wind directions. The large fan is the second longest recorded in the catalog, with a length of approx. 368 m.

work is to determine the wind direction they indicate. We therefore want to be able to distinguish between different wind directions from the same source point, i.e. multiple subsequent eruptions where later eruptions occurred with a different prevalent wind direction. In the Planet Four help content we have emphasized that the volunteers should outline several fans if they appear to start from the same source point. This is very relevant for data like that in Fig. 14, to identify several wind directions indicated by the fans from multiple subsequent jet eruptions. By clustering not only on the base coordinates (x, y) but also on the recorded alignment angle of the fan markings, we are able to distinguish these subsequent fan deposits with different wind directions.

We have determined, by reviewing the clustering results of a subset of the data, that 20 degrees as a clustering value for angles enables this objective. It means that fan markings whose alignment angles differ by more than 20 degrees are clustered into their own sub-cluster, even if they start at the same base point. Blotches, on the other hand, are used for deposits that do not clearly indicate a direction, which is why we do not apply an angle clustering here. However, blotches do not show a clearly identifiable center, and their outline is often less sharply defined, creating a wider distribution of marking results, especially for larger blotches.
Thus, we also cluster on the resulting ellipse radii for the blotches to ensure that we identify the statistically most common shape of the volunteers’ blotch markings.

The values of the clustering parameters strongly influence the number of identified features. We therefore studied extensively how precisely they affect our results by reviewing random subsets of the data-set, which led to the empirical determination of the clustering parameter values

Marking    Dimension      Small    Large
Fans       xy (base)      10 px    NA
           angle (deg)    20       NA
Blotches   xy (center)    10 px    25 px
           radius (px)    30 px    50 px
Table 3:
Empirically determined epsilon values for the clustering pipeline. NA: Fan markings did not require a second clustering run with relaxed precision on the distance; apparently, the fact that a fan requires drawing from a distinguishable starting point helped the volunteers to keep the scatter small, both in base coordinates and angle precision.

that we eventually used for the catalog production. These procedures are discussed in the following sections (see Fig. 15 for an example of reviewing parameter values). The results of the clustering stage are then shown in the lower right (blotches) and lower middle (fans) parts of Figures 16 and 17.

min_samples. As described in Section 3.2, Planet Four tiles have varying numbers of user classifications; thus the classifications for each Planet Four tile are clustered separately, with a variable requirement on the min_samples clustering parameter. More classifications for a Planet Four tile mean that we have a higher “sensitivity” to smaller features (see for example Fig. 10, right), so to achieve a uniform detection efficiency, we implement a scaling factor on the required number of samples per cluster. This results both in a higher sensitivity to have seasonal fans and blotches marked and in higher-precision averaged objects at the end of the clustering process.
In other words, the signal-to-noise ratio (SNR) is higher for a Planet Four tile that was classified by a larger number of volunteers, and we adapted the clustering process to normalize for that fact.

To address the variable SNR in our data, we empirically determined a scaling factor min_samples_factor (MSF) that, multiplied with the number of classifications that contain blotch or fan markings, results in the min_samples value for the DBSCAN algorithm:

$$\mathrm{min\_samples} = \mathrm{round}\left(\mathrm{min\_samples\_factor} \cdot n_{\mathrm{markings}}\right),$$

with $n_{\mathrm{markings}} \le n_{\mathrm{classifications}}$, the number of classifiers that have added either blotch or fan markings as classifications.

The best value for MSF was empirically found to be 0.13. For example, when a Planet Four tile has $n_{\mathrm{class}} = 30$ classifications (our current retirement value) that all contain markings, min_samples will be round(0.13 · 30) = round(3.9) = 4. This value provides the number of cluster members, min_samples, that is required for a cluster to be created. When a tile has 70 submissions, however, the requirement becomes round(0.13 · 70) = round(9.1) = 9 cluster members for a cluster to be deemed a real detection and to be entered into the next stage of the pipeline. This way, we are exploiting the higher sensitivity from the larger number of submitted classifications.

epsilon. The second DBSCAN parameter, epsilon, describes the largest distance that two points are allowed to have for them to be considered to be in the same cluster. The dimension for this measurement depends on what mathematical feature is currently being clustered. When we cluster on the base point coordinates of fans, or on the central point coordinates or semi-radii of blotches, the feature space is measured in pixels, while fan angles are clustered in degrees. The size scale of the dark fans and blotches varies significantly between different regions of interest at the south pole of Mars. Trying to cluster our data with only one value of epsilon, we realized that it was not possible to simultaneously resolve small markings on the order of 20 pixels that were precisely positioned by the volunteers while also successfully clustering markings of much larger deposits that could stretch across more than half of the Planet Four tile shown to the volunteers. The spread in marking coordinates is smaller for smaller features (we think because of an increased attention to detail for smaller features), and thus, to ensure identification of large features, we implemented a second stage of clustering with larger allowed values for epsilon. The resulting values in Table 3 were selected empirically after review of a random subset of the pipeline output. Fig. 15 shows an example parameter scan review graphic that the science team used to determine the parameter values that work best for our task.

Figure 15:
This figure shows our review plots for determining the best clustering parameters for Planet Four tile ID . In this example, we review the fan clustering with a group of 2 different min_samples values, controlled by using a min_samples_factor of 0.1 and 0.13 respectively, leading to min_samples values of 5 and 7. Additionally, we are scanning the epsilon (EPS) value for small deposits with the settings 10, 20, and 30 pixels, while the EPS_LARGE value stays at 25 pixels for these runs (having no effect in this case due to the small size of the markings). The upper left 3 plots are for the setting of MSF=0.1 (resulting in a min_samples value of 5), and EPS between the 10, 20, and 30 pixel values. Then, the second group with an MSF of 0.13 (resulting in min_samples=7) starts in the upper right with the fourth plot in the upper row, and continues in the lower left with the first two plots, again showing the tests for EPS values 10, 20, and 30 pixels respectively. The last two plots in the lower row show what the volunteers actually marked and what they received as input for the markings: the Planet Four tile, cut out from the larger HiRISE image. The number of fans clustered varies significantly for different clustering parameter values, with n between 11 and 16. We favor the setting in the upper right plot, for correctly identifying all small center fans while not creating an object for the small black spot at the top of the image tile.

Figure 16: This figure shows the final pipeline result of the tile from Fig. 15.
Upper Left : The input tile;
Upper Middle : Fan markings of the volunteers;
Upper Right : Blotch markings of the volunteers;
Lower Right : Blotch markings after clustering and averaging the cluster members;
Lower Middle : Fan markings after clustering and averaging the cluster members;
Lower Left : These are the final catalog entries. To reach this, the results from Lower Middle and Lower Right are compared, and the higher-voted markings at comparable locations win. How high that winning ratio must be to enter the final catalog is determined by the threshold value (see Section 4.3.1). Note how the center fans are cleanly identified and win the voting competition against the blotch at the same location. The opposite is true for the small object identified at the middle left, where a red blotch marking has won against the small cyan fan.

4.3. Combination

When the direction of a fan deposit is not very pronounced, i.e. the prevalent winds were weak at the time of the jet eruption, there is ambiguity in identifying the deposit as a fan or a blotch. This can result in a given ground source having surviving clusters of both fan and blotch markings that need to be combined in a strategic way to create a final object category for the observed ground source that will be listed in the resulting object catalog. We make use of the relative frequency with which each marking tool was used to create both marking clusters to identify how fan-like a source is. For example, if 5 people classified a marking as a fan, but 5 other people marked it as a blotch, we assign a fan probability P(fan) of 0.51 by applying

$$P(\mathrm{fan}) = \frac{n_{\mathrm{fans}}}{n_{\mathrm{fans}} + n_{\mathrm{blotches}}} + 0.01,$$

with $n_{\mathrm{fans}}$ and $n_{\mathrm{blotches}}$ the number of volunteers that marked either. The fudge value 0.01 is required to be able to make an either-or decision for the object when $n_{\mathrm{fans}} = n_{\mathrm{blotches}}$, flipping the switch in this close call for fans instead of blotches, due to the usefulness of fans for further scientific analysis.

We determine to which markings this procedure is applied by calculating the pair-wise Euclidean distance for all clustered objects and checking whether clusters are within a chosen limit of 30 pixels of each other.
We chose this value to allow slightly more imprecision in the markings’ positioning than in the clustering algorithm that went into creating these averages, but without combining too many markings that really should be individual items. We have reviewed several hundred subsets of data and determined 30 pixels to be a good compromise between these competing tasks. If a pair meets the distance criterion for combination, we use the above formula to calculate P(fan) for this pair of markings. This value goes from 0 to 1, with 0 being a definite blotch ($n_{\mathrm{fans}} = 0$) and 1 being a definite fan ($n_{\mathrm{blotches}} = 0$), in other words either none or all volunteers had drawn a fan or a blotch, respectively. We then create a meta-object for this pair, storing P(fan) under the name ‘vote_ratio’ in the catalog files, together with all other data for both objects. We do this to enable future users of the catalog to decide on their own how reliably a marking is required to be a fan before it is used as such, with its data entering a study. For example, a specific study might require using only the clearest fan markings, perhaps with a P(fan) larger than 0.8. Applying such a cut is called Thresholding in our pipeline, described in the next section.
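The combination and thresholding logic can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the pipeline code: it assumes the vote ratio has the form n_fans / (n_fans + n_blotches) + 0.01, which reproduces the value 0.51 quoted for a 5-to-5 split, it only compares fans against blotches, and it measures the distance between the stored (x, y) positions. The dictionary keys and function names are hypothetical.

```python
import math

COMBINE_LIMIT_PX = 30.0  # distance limit quoted in the text
FUDGE = 0.01             # tie-breaker in favour of fans


def p_fan(n_fans, n_blotches, fudge=FUDGE):
    """Vote ratio, assuming P(fan) = n_fans / (n_fans + n_blotches) + fudge."""
    return n_fans / (n_fans + n_blotches) + fudge


def combine(fans, blotches, limit=COMBINE_LIMIT_PX):
    """Pair every fan with every blotch closer than `limit` pixels.

    Each object is a dict with the hypothetical keys 'x', 'y', 'n_votes'.
    Returns meta-objects holding both members and their vote_ratio.
    """
    meta = []
    for fan in fans:
        for blotch in blotches:
            dist = math.hypot(fan["x"] - blotch["x"], fan["y"] - blotch["y"])
            if dist <= limit:
                ratio = p_fan(fan["n_votes"], blotch["n_votes"])
                meta.append({"fan": fan, "blotch": blotch, "vote_ratio": ratio})
    return meta


def apply_cut(meta_object, cut=0.5):
    """Return the member of a meta-object that survives a cut on P(fan)."""
    if meta_object["vote_ratio"] > cut:
        return meta_object["fan"]
    return meta_object["blotch"]


demo_fans = [{"x": 100.0, "y": 100.0, "n_votes": 5}]
demo_blotches = [
    {"x": 110.0, "y": 105.0, "n_votes": 5},   # about 11 px away: combined
    {"x": 400.0, "y": 300.0, "n_votes": 12},  # far away: left alone
]
demo_meta = combine(demo_fans, demo_blotches)
```

In this toy case the tied pair is written out as a fan for a cut of 0.5 and as a blotch for a stricter cut of 0.8, which mirrors the role of the fudge value described above.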
For concrete applications, e.g. for this publication, a scientist can now apply a cut on P(fan) that will write out the decision to a new catalog file with fans and blotches. For example, a cut on P(fan) of 0.8 would mean that all meta-objects with a value smaller than 0.8 will be written out as the underlying blotch, while for meta-objects with a value larger than 0.8 the stored fan will be written out. In both cases, the remaining data of the meta-object that was thresholded against will be dropped for the newly created catalog file, but it is still available for other thresholding operations as an intermediate data product. An example use case would be a scientist who wants to study the sensitivity of their research to the applied cut; for example, if we want to provide wind direction data to a mesoscale climate simulation, we might want to make sure that only the most certain directions are used and would apply a higher cut on the meta-object value.

For the catalog that we deliver with this work, we chose a simple majority threshold of 0.5, so that the catalog offers the broadest use case. Choosing simple majority means that we take a marking as a fan as soon as at least as many volunteers have classified the object as a fan as have classified it as a blotch. Catalog files with this applied P(fan) threshold of 0.5, all intermediate data products, and instructions on how to apply a threshold for writing out new catalog files will be provided as supplementary products (see Appendix D for more details).

Figure 17:
Three example Planet Four tile pipelines, for APF0000b0t, APF0000ops, and APF0000bk7. See Fig. 16 for a detailed description of the pipeline plotting sequence.

4.4. Ground Projection

For each Planet Four tile, the clustering of volunteer-drawn markings to identify seasonal sources is performed using the pixel positions within Planet Four tiles. Once the cluster dimensions and position have been identified, the source’s true location on the South Pole must be calculated. However, the HiRISE team-generated non-map-projected color mosaics that the Planet Four tiles are derived from do not contain the spacecraft information necessary to compute the latitude and longitude per pixel. We partially reconstruct the mosaics from the raw HiRISE image products, or Experiment Data Records (EDRs), building a red-filter-only composite image with the spacecraft information required to perform coordinate transforms. The HiRISE EDRs were obtained from NASA’s Planetary Data System (PDS) HiRISE PDS Data Node. We developed a reduction pipeline in Python using the US Geological Survey’s (USGS) Integrated Software for Imagers and Spectrometers (ISIS) [Anderson et al., 2004; Becker et al., 2007] and the ISIS-3 Python wrapper Pysis for this purpose.

We briefly summarize the steps as shown in Fig. 18, including the required ISIS-3 application names, to generate the red-filter-only mosaic. We start with the center two RED filter CCDs (RED 4 and 5), each with two readout channels. All four EDR files (2 for each CCD) are read in and converted to ISIS-3 cube format, and the SPICE (Spacecraft & Planetary ephemerides, Instrument C-matrix and Event kernels) information for MRO is added to the EDR headers. For each CCD, we combine the two channel EDRs into a single image. The combined image is then normalized to remove both the striping and left/right normalization effects. This is not a necessary step for obtaining map projection information but makes it easier to visually inspect the final combined mosaic.
Once both CCDs have been reduced they are combined in a final mosaic accounting for the 48 pixel (in 1 × campt application. The catalog tables P4_catalag_v1.0_L1C_cut_0.5_fan_meta_merged.csv and _blotch_meta_merged.csv respectively, provided as supplemental files, include the cluster coordinates as latitude/longitude derived from this process, as well as a set of positional coordinates (X, Y, Z) in the body-fixed reference frame for Mars, measured in kilometers.

http://isis.astrogeology.usgs.gov/
https://github.com/wtolson/Pysis

[Figure 18 flowchart, boxes with the ISIS-3 application on each connecting arrow: images (2 center CCDs, 2 channels per CCD) → (hi2isis) → add SPICE data → (spiceinit) → stitch channels, creating 1 image per CCD → (histitch) → remove striping and normalization problems → (cubenorm) → create mosaic by merging the 2 remaining center CCD images → (handmos) → translate fan and blotch pixel positions to lat/lon → (campt) → ground coordinates.]

Figure 18:
Process for creating single channel non-map projected mosaics with the required SPICE header information used to convert Planet Four feature pixel coordinates to geographical lat/lon coordinates. The required ISIS-3 applications for each stage are listed in the arrows.
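The relation between the body-fixed (X, Y, Z) coordinates and the latitude/longitude stored in the catalog can be checked with a small sketch. It assumes a spherical, planetocentric convention with east-positive longitude, which may differ from the convention that campt reports; the function name is hypothetical, and the input values are those of the first example row of the fan catalog table.

```python
import math


def bodyfixed_to_latlon(x_km, y_km, z_km):
    """Convert body-fixed Cartesian coordinates (km) to lat/lon (degrees).

    Assumes planetocentric latitude and east-positive longitude in
    [0, 360); this convention is an assumption of the sketch.
    """
    radius = math.sqrt(x_km**2 + y_km**2 + z_km**2)
    lat = math.degrees(math.asin(z_km / radius))
    lon = math.degrees(math.atan2(y_km, x_km)) % 360.0
    return lat, lon, radius


# First example row of the fan catalog table (obsid ESP_012079_0945).
demo_lat, demo_lon, demo_radius = bodyfixed_to_latlon(
    -65.804336, 261.407884, -3370.504345
)
```

The result is a latitude between 85° S and 86° S at a distance of roughly 3381 km from the planet center, which is plausible for a south polar HiRISE target.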
As previously mentioned in Section 2, to avoid edge effects, the cutting of HiRISE images into screen-sized tiles is performed such that there is a 100-pixel overlap with the neighboring tiles. This way, fans and blotches that cross the boundary between tiles will be completely visible in at least one of the tiles of an area. However, from our own Planet Four marking efforts and from analyzing results from Planet Four volunteers, we have determined that the classification tools provide such a high level of precision in placement that many volunteers position and push a fan or blotch marking out of bounds of the shown image area to make it fit a partially shown fan or blotch. This results in several markings for the same object stemming from different Planet Four tiles, as shown in Fig. 19. It can be seen in this figure that the directions of fans match, despite the fact that some tiles only showed a small part of a fan in the overlap area. We hence conclude that a wind direction analysis is not adversely affected by this analysis artefact. For a future study focusing on the area covered by markings and counts of fan and blotch activity, we will implement a merging procedure to remove multiple markings, similar to the Combination step in our pipeline, as described in Section 4.3.

Figure 19:
Six neighboring Planet Four tiles of HiRISE image ESP_011931_0945 are merged in this plot. The tiles have the following tile coordinates within the HiRISE image and Planet Four tile_ids: Upper Left: (1, 33), b1j; Upper Right: (2, 33), b10; Middle Left: (1, 34), b0p; Middle Right: (2, 34), b20; Lower Left: (1, 35), b0t; Lower Right: (2, 35), b0a (all 3-letter tile_ids need ‘APF0000’ prepended for the full ID). The shape of the tiles is distorted compared to their displayed on-screen size for this plot. Each tile was clustered individually, indicated by the different marking colors. The solid lines indicate where an unshared division between the tiles would lie; the dashed lines show the overlap region that was added to each tile to maximize available information for the volunteers. This plot is instructive in showing how the marked fans, specifically their directions, match very well, despite the fact that sometimes only a very small part of the whole fan marking was visible to the classifying volunteer. For increased precision in total marking counts and the area covered by markings we will design an object merging procedure on these overlap regions (next paper).

5. Data Validation

To date, there is no published catalog of the locations and numbers of seasonal defrosting features for any of the HiRISE images of the Martian south polar region to compare to the Planet Four results. In order to assess the accuracy and recall rate of Planet Four and confirm that the majority of fans and blotches present in the HiRISE observations are identified when combining multiple classifier markings, we have created a ‘gold standard’ data-set based on expert assessment. Using the same classification interface and marking tools on the Planet Four website as the citizen scientists used, the Planet Four Science team reviewed a subsample of the Seasons 2 and 3 tiles and produced a catalog of markings.
Similar validation processes have been applied in analyses of our previous Planet Four publication for the sister project Planet Four: Terrains [Schwamb et al., 2017a] and to crater counting crowd-sourced data for the Moon [Bugiolacchi et al., 2016; Robbins et al., 2014].

To generate the gold standard data-set, 960 Season 2 tiles and 767 Season 3 tiles were randomly selected and equally divided amongst three of the primary Planet Four Science Team members (GP, KMA, MES) to review. This corresponds to 3 % of the tiles from each season classified on Planet Four. Additionally, another 192 tiles, from both Season 2 and 3, were randomly chosen and classified by all science team gold standard classifiers in order to compare the science team markings to each other. This corresponds to approximately 0.4 % of each season’s tiles. The Planet Four tile_ids of the gold standard classifications and the user names of the science team members that did the analysis are provided in supplemental data files
P4_catalog_v1.0_gold_standard_ids.zip.

[Figure 20 plot: "Common Expert data vs Catalog"; y-axis: number of tiles; legend: GP, MES, KMA, catalog.]
Figure 20:
Comparing counts of identified objects (i.e. fans and blotches together) per Planet Four tile between experts and the catalog data; here, for the 192 common tile_ids that were classified by all experts. Bin size is 5; each bin is directly compared between the data from all experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). Binning max was cut off at 75, omitting single entry bins above.
We use the expert classifications from the science team with our final catalog in order to explore how well fan and blotch features are identified and how accurately the shapes and dimensions are represented in the Planet Four catalog. We show a tile-based comparison in Appendix F.1,
[Figure 21 plot: "Expert vs Catalog object identification frequency"; three panels (GP, MES, KMA, each against the catalog); y-axis: number of tiles.]
Figure 21:
Comparing counts of identified objects (i.e. fans and blotches together) per Planet Four tile between experts and the catalog data. Bin size is 5; each bin is directly compared between data from experts (in dark blue) and catalog data (in orange), with the experts GP, MES, and KMA respectively, from top to bottom. Each histogram contains data for 432 tiles, with each expert classifying an independent data-set.

but first we examine the collective properties of the part of the Planet Four catalog that represents the gold standard tiles. We compare and contrast these distributions to the expert classifications together and per expert reviewer.

Figure 20 compares the number distribution of identified sources (i.e. fans + blotches) per Planet Four tile between experts and the catalog data for the 192 common tiles that were classified by all three science team members (KMA, GP, MES). Among the expert classifiers there are some visible differences, especially where the interpretation of a single image or two dominates the value of the histogram bin. The final catalog is within the variance of the individual expert assessments. We can see this further in Figure 21, which shows the number distribution of identified objects (i.e. fans and blotches together) per Planet Four tile when comparing the results for the tiles that were only classified by one of the science team members. We note that even tiles with 30 or 40 fans and/or blotches are still well represented in the catalog.

5.2. Fan lengths and blotch areas
[Figure 22 plot: "Fan lengths, common expert data vs catalog"; x-axis: fan lengths [pixel]; y-axis: number of fans; legend: GP, MES, KMA, catalog.]
Figure 22:
Comparing measured fan lengths between experts and the catalog data; here, for the 192 tile_ids that were classified by all experts. Bin size is 30; each bin is directly compared between the data from all experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). Binning max was cut off at 600, omitting single entry bins above.
We also use our expert gold standard classifications to examine the physical sizes and areal coverage of the Planet Four catalog fans and blotches (see Figures 22 to 25). As in previous comparisons, there is good agreement. The differences between the catalog and the experts are within the variance seen between the individual expert classifiers. Differences between the catalog and experts become more apparent in small-number regimes (when <10 sources comprise the bin). The differences between the distributions in these sparsely populated bins are consistent with small-number Poisson uncertainty on the histogram values [Kraft et al., 1991]. Thus, fan lengths and blotch areas are well reflected in the Planet Four catalog.
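As a rough illustration of the bin-by-bin comparison, the sketch below bins a list of lengths and checks whether two bin counts agree within their counting noise. It uses the simple sqrt(N) approximation for the Poisson uncertainty, which is a swapped-in simplification and not the confidence limits of Kraft et al. [1991]; the function names and the 2-sigma tolerance are hypothetical.

```python
import math
from collections import Counter


def binned_counts(values, bin_size, max_value):
    """Histogram counts for values in [0, max_value) with fixed bin size."""
    counts = Counter(int(v // bin_size) for v in values if 0 <= v < max_value)
    n_bins = math.ceil(max_value / bin_size)
    return [counts.get(i, 0) for i in range(n_bins)]


def consistent(n_a, n_b, n_sigma=2.0):
    """True if two bin counts agree within n_sigma of the combined
    sqrt(N) counting error (a crude stand-in for Poisson limits)."""
    sigma = math.sqrt(n_a + n_b) or 1.0  # avoid a zero tolerance for empty bins
    return abs(n_a - n_b) <= n_sigma * sigma


# Fan lengths in pixels, binned as in Figs. 22 and 23 (bin size 30, max 600).
demo_counts = binned_counts([5, 31, 59, 60, 599, 600], bin_size=30, max_value=600)
```

Under this approximation a bin with 4 expert fans and 7 catalog fans is still consistent, whereas 100 against 160 is not, which is why differences in sparsely populated bins carry little weight.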
[Figure 23 plot: "Fan lengths, expert vs catalog"; three panels (GP, MES, KMA, each against the catalog); x-axis: fan lengths [pixel]; y-axis: number of fans.]
Figure 23:
Comparing measured fan lengths between experts and the catalog data. Bin size is 30; each bin is directly compared between the data from all experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). Binning max was cut off at 600, omitting single entry bins above.

[Figure 24 plot: "Blotch area, common expert data vs catalog"; x-axis: blotch area [pixel**2]; y-axis: number of blotches; legend: GP, MES, KMA, catalog.]
Figure 24:
Comparing measured blotch areas between experts and the catalog data; here, for the 192 tile_ids that were classified by all experts. Bin size is 5000; each bin is directly compared between the data from all experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). Binning max was cut off at 80,000, omitting single entry bins above.

[Figure 25 plot: "Blotch area, expert vs catalog"; three panels (GP, MES, KMA, each against the catalog); x-axis: blotch area [pixel**2]; y-axis: number of blotches.]
Figure 25:
Comparing measured blotch areas between experts and the catalog data. Bin size is 5000; each bin is directly compared between the data from all experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). Binning max was cut off at 120,000, omitting single entry bins above.

5.3. Wind direction comparison

[Figure 26 plots. Left: "Histogram of deltas between science team and volunteer mean fan directions"; x-axis: delta mean wind direction per Planet Four tile; y-axis: bin counts. Right: "Histogram of angular STD for merged fan clusters"; x-axis: fan angle standard deviation per cluster [deg]; y-axis: bin counts.]
Figure 26: Left:
From the 192 tiles that were analyzed by the science team, 82 resulted in fan catalog entries. Of those, we used 39 that had more than 3 fans, for better statistics (the median number of fans per tile is 4, see Section 6). In this histogram, we show the difference between the mean angle of the fans in these 39 Planet Four tiles between the science team and the volunteers. Overall, we have a good agreement, with a few rare outliers, discussed in the text and in Figures 27 and 28. Bin size is 2.
Right:
Standard deviations (STDs) of the directions of fan markings that went into each cluster, before they are merged into the average resulting catalog object. This plot shows the distribution of these STDs for the set of 192 common gold tiles, which had a total of 904 fans. Bin size is 1.
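The two quantities plotted in Fig. 26 can be computed along the following lines. The estimator used in the pipeline is not spelled out here, so this is a sketch under assumptions: it uses a circular mean so that directions near 0°/360° do not bias the result, and it takes the standard deviation of the wrapped differences to that mean. All function names are hypothetical.

```python
import math


def wrapped_delta_deg(a, b):
    """Smallest signed difference a - b between two directions, in [-180, 180)."""
    return ((a - b + 180.0) % 360.0) - 180.0


def circular_mean_deg(angles):
    """Mean direction in degrees, computed on the unit circle."""
    s = sum(math.sin(math.radians(a)) for a in angles)
    c = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(s, c)) % 360.0


def angular_std_deg(angles):
    """Standard deviation of the wrapped differences to the circular mean."""
    mean = circular_mean_deg(angles)
    deltas = [wrapped_delta_deg(a, mean) for a in angles]
    return math.sqrt(sum(d * d for d in deltas) / len(deltas))


# Per-cluster spread (Fig. 26, right) and expert-vs-catalog delta (Fig. 26, left).
demo_std = angular_std_deg([180.0, 190.0, 200.0])
demo_delta = wrapped_delta_deg(circular_mean_deg([185.0, 195.0]), 190.0)
```

A cluster of markings at 350° and 10° yields a spread of 10° with this approach, whereas a naive arithmetic treatment would report 170°.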
Fig. 26, left, shows a histogram of the differences in the mean-over-tile fan directions between the catalog entries that are clustered from all the volunteers’ markings and the average from the three science team members. In general, the agreement is very good, with differences usually smaller than 10 degrees. Another way to investigate our uncertainties is to calculate the angular standard deviation of the cluster member markings that are merged into each final catalog object, independent of whether the markings were done by an expert or a volunteer. Fig. 27 discusses the lower outlier of Fig. 26, indicating that the respective Planet Four tile presents a more difficult than usual scenario with a naturally occurring higher variance of the actual deposit directions on the ground. Not only are the deposit shapes visible in the upper left more irregular than usual, there is also a visible gradient of directions across this tile, as can be seen by the exaggerated fan pointers. This gradient is probably caused by the basin shapes in the Inca City region that can create a topographical control of the alignment of fan deposits over the usual wind control. Nevertheless, our reduction pipeline reliably reduces the markings for every deposit, albeit with higher than usual variance in the orientation and size of the markings. With no single clear fan direction in the image tile, it is reasonable to expect a higher variance and hence a higher delta when compared to the 3 science team members.

In a similar fashion, Fig. 28 discusses the high-side outlier of Fig. 26. While fans have been identified, their count is low, so small deviations have a larger effect on the comparison with the catalog data. Additionally, the few fans that are visible appear to show different directions, leading to a less certain fan direction with a higher variance, which in turn can lead to larger differences when comparing their values.

In Fig. 26, right, we plot the standard deviations for all 904 fan clusters for the 192 common tiles that were analyzed by all experts. The right end of this histogram is cut off by our angular clustering parameter of 20°, meaning larger angular differences are never clustered together. However, the majority of standard deviations lie far below that safety cut-off value for the clustering. We estimate an average uncertainty for our fan directions of about (5 ±

In conclusion, our catalog has high completion in most cases. Outliers have been found to be caused by special circumstances with more challenging classification tasks, creating higher variance for all classifiers, including the experts. The analysis of the gold standard sample demonstrates that the bulk composition of the Planet Four catalog represents a fairly complete picture of the seasonal fans and blotches captured in the HiRISE images.

Figure 27:
One of the outliers of Fig. 26, Planet Four tile ID
APF00002aj of HiRISE image
ESP_012744_0985. The input image shows deposit shapes with less pronounced boundaries, leaking into the background. There is also a visible gradient of directions across the tile (visible through the extended fan pointers). See the text for more interpretation.
Figure 28: The highest outlier from Fig. 26, Planet Four tile ID APF0000c0t, from HiRISE image ESP_012858_0855. While fans have been identified, their number is small, increasing the chance for variance between the experts and catalog data.

6. Results: Fan and Blotch Catalog

From 221 HiRISE images from Mars years 29 and 30, cut up into 42,904 Planet Four tiles, the Planet Four volunteers produced almost 2.8 million fan markings, which were clustered into 159,558 fans in our MY29/MY30 catalog. In Table 4 we show an example of fan catalog data. For blotches, 3.46 million raw markings were combined into 250,164 blotches. 29.6 % of the image tiles (= 12,693) end up without any clustered markings in our catalog. Fig. 29 shows the distribution of the fraction of empty tiles per HiRISE image vs. solar longitude. Visual checks of data with fractions above 0.8 confirmed that these HiRISE images are mostly free of CO2 jet deposits in spring; in late summer, however, when the seasonal CO2 ice layer has fully sublimated, fan and blotch deposits are rendered mostly invisible, because they blend into the now ice-free background. A notable exception to this general effect is the ROI Inca City, where the summer data, after L_s = …°–260°, regularly show fan deposits that are still discernible. This could point to an interesting difference in the ground soil compaction and its related observed texture. New deposits from CO2 jet eruptions may be sufficiently different in texture from the background as a result of particle sorting and related phase function changes of the fresher surface.
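The empty-tile statistic can be illustrated with a short sketch. It assumes that a complete list of (obsid, tile_id) pairs is available separately from the catalog, since empty tiles by definition do not appear in the catalog itself; the function name and the input layout are hypothetical and not part of the released data products.

```python
from collections import defaultdict

def empty_tile_fraction(all_tiles, catalog_rows):
    """Fraction of tiles without any catalog object, per HiRISE image.

    all_tiles: iterable of (obsid, tile_id) pairs for every tile served.
    catalog_rows: iterable of dicts that carry at least a 'tile_id' key.
    """
    tiles_with_objects = {row["tile_id"] for row in catalog_rows}
    total = defaultdict(int)
    empty = defaultdict(int)
    for obsid, tile_id in all_tiles:
        total[obsid] += 1
        if tile_id not in tiles_with_objects:
            empty[obsid] += 1
    return {obsid: empty[obsid] / total[obsid] for obsid in total}

# Hypothetical toy input: image A has two tiles, image B has one.
tiles = [("A", "t1"), ("A", "t2"), ("B", "t3")]
catalog = [{"tile_id": "t1"}]
fractions = empty_tile_fraction(tiles, catalog)

# Overall figure quoted in the text: 12,693 empty tiles out of 42,904.
overall_percent = round(100.0 * 12693 / 42904, 1)
```

The last line only re-derives the percentage quoted above from the two tile counts given in the text.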
[Figure 29 plot: "Distribution of empty tiles vs time"; x-axis: Solar Longitude [°], 180 to 300; y-axis: Fraction of empty tiles per HiRISE image.]

Figure 29: Distribution of empty tiles over time, measured in Mars Solar Longitude. Until L_s = 260° the fraction of HiRISE images that can be empty varies randomly, reflecting the different ground surfaces imaged across all latitudes. After L_s = 260° all CO2 is gone (earlier at lower latitudes), and most of the HiRISE images appear empty in terms of identifiable blotches or fans, because any deposits blend with the ice-free background.

   angle   distance  tile_id     image_x  image_y   marking_id  n_votes  obsid            spread  version  vote_ratio  x       x_angle
0  205.56  179.71    APF0000ci9  2270.76  24336.16  F000000     35       ESP_012079_0945  88.03   1        1.00        790.76  -0.90
1  185.39  179.62    APF0000cia  3391.21  5640.60   F000001     15       ESP_012079_0945  21.35   1        1.00        431.21  -1.00
2  184.98  500.27    APF0000cia  3509.96  5876.70   F000002     10       ESP_012079_0945  18.91   1        1.00        549.96  -1.00
3  184.29  105.43    APF0000cia  3716.27  5824.50   F000004     6        ESP_012079_0945  26.41   1        0.68        756.27  -1.00
4  189.42  109.50    APF0000cia  3452.17  6033.00   F000005     3        ESP_012079_0945  22.58   1        0.51        492.17  -0.99
5  194.16  335.78    APF0000cib  3565.47  15930.34  F000006     64       ESP_012079_0945  34.93   1        1.00        605.47  -0.97
6  187.74  183.41    APF0000cib  3143.15  15433.60  F000007     20       ESP_012079_0945  25.68   1        1.00        183.15  -0.99
7  209.47  179.29    APF0000cid  942.95   22257.99  F000008     58       ESP_012079_0945  49.11   1        1.00        202.95  -0.87
8  199.91  220.64    APF0000cid  1199.11  21994.01  F000009     54       ESP_012079_0945  35.37   1        1.00        459.11  -0.94
9  218.88  118.16    APF0000cid  815.95   22539.28  F00000a     42       ESP_012079_0945  49.66   1        1.00        75.95   -0.77

   y       y_angle  l_s      north_azimuth  map_scale  BodyFixedCoordinateX  BodyFixedCoordinateY  BodyFixedCoordinateZ
0  224.16  -0.43    214.785  126.856883     0.25       -65.804336            261.407884            -3370.504345
1  160.60  -0.09    214.785  126.856883     0.25       -67.219114            257.011589            -3370.631413
2  396.70  -0.09    214.785  126.856883     0.25       -67.170611            257.055226            -3370.630794
3  344.50  -0.07    214.785  126.856883     0.25       -67.127761            257.024926            -3370.635002
4  553.00  -0.16    214.785  126.856883     0.25       -67.169940            257.096267            -3370.628302
5  586.34  -0.24    214.785  126.856883     0.25       -66.258570            259.361039            -3370.571273
6  89.60   -0.13    214.785  126.856883     0.25       -66.400170            259.284370            -3370.565666
7  337.99  -0.49    214.785  126.856883     0.25       -66.296391            261.048812            -3370.492211
8  74.01   -0.34    214.785  126.856883     0.25       -66.261274            260.965240            -3370.497183
9  619.28  -0.62    214.785  126.856883     0.25       -66.300167            261.124709            -3370.487589

   PlanetocentricLatitude  PlanetographicLatitude  PositiveEast360Longitude
0  -85.427383              -85.480830              104.129523
1  -85.493546              -85.546226              104.656897
2  -85.493039              -85.545725              104.644396
3  -85.493723              -85.546401              104.637107
4  -85.492368              -85.545061              104.642019
5  -85.459101              -85.512180              104.330752
6  -85.459755              -85.512827              104.364183
7  -85.431209              -85.484612              104.249678
8  -85.432730              -85.486115              104.246813
9  -85.429945              -85.483362              104.246483

Table 4: First ten lines of the fan catalog file P4_catalog_v1.0_L1C_cut_0.5_fan_meta_merged.csv, broken into three segments.

Figure 30: Planet Four tile
APF00006mr from HiRISE image ESP_011296_0975 has the highest number of resulting fan entries per tile. Top: Input tile as seen by volunteers; Bottom: Overlaid clustering results from the catalog.

Figure 31: Planet Four tile APF00007t9 from HiRISE image ESP_012604_0965 has the highest number of resulting blotch entries per tile. Top: Input tile as seen by volunteers; Bottom: Overlaid clustering results from the catalog.

6.1. Catalog properties

6.1.1. Fan counts

The highest counts of fans and blotches were 167 fans in the tile_id APF00006mr and 278 blotches in the tile_id APF00007t9, shown in Figures 30 and 31. These data serve as an indication of the dedication of the Planet Four volunteers in producing results at such high spatial density. The median count of fans and blotches per tile is 4. The distribution of both numbers is shown in Fig. 32.
As an example of the possibilities of the produced catalog, we describe the measured fan lengths in the catalog. The catalog column distance requires scaling by the values in map_scale to correct for the different HiRISE binning modes. The distribution of these measurements is shown in Fig. 33. About 97 % of all fans are below 100 m in length, with a median value of 24 m. The three largest fans measured are all from the same ROI called Manhattan Classic (Lat −86.39°, Lon 99°), having lengths of 373 m, 368 m and 361 m respectively. They were identified in the HiRISE images ESP_013095_0935 (longest) and ESP_011961_0935 (second and third). The two longest fan markings even identify the same fan, but at different times in the season, with the longest observed at L_s = 265° and its shorter self at L_s = 209°. As the two measurements differ by only 5 m, we attribute the increased marking length both to material potentially being moved around by winds during spring and to a decrease in identification precision after the CO2 has sublimed and the deposits start to fade into the background. However, we interpret the fact that the largest fan was identified twice as a further indication of the high reliability of our results, considering that the random image serving procedure of the Planet Four classification interface ensured that volunteers do not classify images in the order in which they were taken, which would have increased the chances of their being biased by their previous classifications. In this case, where 119 volunteers classified APF0000dtk with the longest fan and 54 volunteers classified APF0000de3 with the second longest fan (shown in Fig. 14), only one volunteer was found to have classified both.

An overview of the fan length distributions for all major ROIs over both Martian years of data is shown in Fig. 34. When compared between Mars years 29 and 30, the total (over all ROIs) fan length statistics are very similar, with a median of 24.2 m for MY29 and 23.8 m for MY30. However, we identify specific ROIs that have different fan properties between MY29 and MY30. For example, the ROI Manhattan has a median fan length of 42 m in MY29 and a decreased median of 25 m in MY30. This is in contrast with the ROI Giza, where the trend has the opposite direction: the median fan length is 44 m in MY29, comparable with Manhattan's median in the same year, but then increases to 59 m for MY30. Meanwhile, in ROI Ithaca, both years are very similar, with median fan lengths of 39.5 m and 38.6 m respectively.
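The length statistics above follow from a simple per-row scaling. The sketch below applies it to the first three rows of Table 4; it assumes that distance is stored in pixels and that map_scale is in metres per pixel, so that their product is a length in metres. This interpretation, the helper names, and the use of in-memory rows instead of the CSV file are assumptions made for illustration.

```python
import statistics

def fan_lengths_m(rows):
    """Scale 'distance' by 'map_scale' (assumed metres per pixel) for each row."""
    return [float(r["distance"]) * float(r["map_scale"]) for r in rows]

def fraction_below(lengths, limit_m):
    """Share of fans that are shorter than limit_m."""
    return sum(1 for v in lengths if v < limit_m) / len(lengths)

# First three rows of Table 4; csv.DictReader on the catalog file would
# yield dictionaries of the same shape (string values keyed by column name).
rows = [
    {"distance": "179.71", "map_scale": "0.25"},
    {"distance": "179.62", "map_scale": "0.25"},
    {"distance": "500.27", "map_scale": "0.25"},
]
lengths = fan_lengths_m(rows)
median_length = statistics.median(lengths)
share_below_100 = fraction_below(lengths, 100.0)
```

Applied to the full catalog, the same two statistics would correspond to the median and the cumulative fractions shown in Fig. 33.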
[Figure 32 plot: two histogram panels, "marking = blotch" and "marking = fan"; x-axis: Markings per tile.]

Figure 32: Count distributions of catalog objects per tile, blotches on the left, fans on the right. The bin size is 5 counts in both plots.
[Figure 33 plot: left panel "Normalized Log-Histogram of fan lengths", x-axis: Fan length [m]; right panel, x-axis: Fan length [m], y-axis: Fraction of fans with given length.]

Figure 33: Normalized histograms of all fan lengths in the catalog. Left: Log-Histogram, Right: Cumulative Histogram. The median (i.e. fraction of 0.5) value is at 24 m, with 82 % of the fans shorter than 50 m and 97 % shorter than 100 m, as indicated by the lines in the plot.
[Figure 34 plot: "Fan lengths in different ROIs"; x-axis: distance_m, 50 to 350; y-axis: region (Maccelsfield, Starburst, Manhattan, Bilbao, Ithaca, Portsmouth, Manhattan_Frontinella, BuenosAires, Inca, Giza, Potsdam, Oswego_Edge).]
Figure 34:
Boxplot showing the distributions of fan lengths for our set of regions of interest at the Martian south pole, over both Martian years of data, MY 29 and 30 (Inca City and Inca City Ridges are combined here due to their proximity). The boxplot uses the standard setup, with the interquartile range (IQR) for the box, whiskers extending to 1.5×IQR, and single dots for outliers.

7. Wind Direction Results from Four Sample Regions of Interest (ROIs)

Early in the mission, the HiRISE team defined several regions of interest (ROIs) within the southern polar areas that have been extensively monitored for seasonal activity ever since (the list of original seasonal ROIs can be found in Hansen et al. [2010]). We have selected a subset of these ROIs to be analyzed by Planet Four, as shown in Table 1. The map of the ROIs' distribution over the pole is shown in Fig. 4.

Below we focus on 4 example ROIs to showcase the use of the Planet Four data catalog and our ability to monitor wind directions using the positions and locations of fan markings. We have picked these 4 ROIs (informally named Ithaca, Giza, Manhattan, and Inca City) for regional case studies of the seasonal winds because the temporal coverage over these locations is the highest. We describe each ROI's general setting and geomorphology based on HiRISE observations and our previous works [Hansen et al., 2010; Pommerol et al., 2011]. We then present the wind direction maps over the spring season at each of these locations. The wind rose diagrams for each HiRISE image separately are available in the supplementary file P4_catalog_v1.0_wind_rose_diagrams.pdf.
The Ithaca region is located at southern latitude 85.2°, eastern longitude 181.4°. This location is away from the permanent polar cap, at the edge of the cryptic region, and situated on a surface that is relatively smooth on a large scale: the digital terrain model produced by HiRISE (DTEPD_040189_0950_040216_0950_A01) shows vertical elevation variations of less than 60 m across the Ithaca region. At the same time, on the meter scale the surface in Ithaca is rough, showing irregular and uneven bumps and pits. No araneiforms (i.e. radially-organized channels) were detected in Ithaca according to HiRISE imaging, while rare isolated troughs and patterned ground similar to araneiform troughs are present [Hansen et al., 2010].

During local spring, fan-shaped deposits densely cover the Ithaca region (see an example in Fig. 35). Opening angles and lengths of the fans were reported to evolve during spring, but the nature of these changes was not quantified [Thomas et al., 2010]. Multiple fans were observed to emerge from common vents, at times merging together to create a wider single fan. The directions of the fan deposits were noted to be consistent from one Martian year to another, with only little variation.

An interesting detail about Ithaca is the very prominent bluish halos and fans that are repeatedly observed here [Thomas et al., 2010]. In contrast to the more common dark fan-shaped deposits, these halos and fans have higher albedos, approaching the albedo of fresh ice deposits. In Ithaca they are also distinctively bluer than the rest of the surface. There are at least two types of such bright deposits. One type resembles narrow fans that are located centrally over the older dark fans. These appear early in spring, before L_s = 190°. The other type resembles halos contouring the pre-existing dark fans. They appear on average later than the narrow bright fans.

In summer (L_s > 270°), the seasonal deposits are mostly invisible in Ithaca.
Partially, this is because the small-scale roughness creates a patchy-looking environment, with pits being darker than bumps either due to shadows or due to dust collecting in depressions.

Fig. 35 shows a typical plot that we will use to analyze derived wind directions in our ROIs. This particular plot was created from Planet Four data for one HiRISE image (ESP_011931_0945) taken in Ithaca at L_s = 207°. To create this plot we took all the fan markings over the HiRISE image and plotted them as a histogram of their directions (top right panel of Fig. 35). Note that, in contrast to standard wind rose diagrams, which show the directions from which winds originate, we use this diagram to show the measured deposition directions caused by the winds, i.e. the opposite of the wind origins. We chose this kind of display because it relates more closely to the actual measurements performed by the Planet Four project and does not imply any interpretation. The fan direction is counted clockwise (CW) from the North Azimuth (NA) direction, where 0° always represents North, and 270° West. The histogram is not scaled, i.e. the y-axis shows the actual counts of the fan markings with the direction of each bin on the x-axis. The maximum of the histogram is the most probable direction for the markings, and the width indicates how variable the directions of the markings are for this particular L_s. The default size of each histogram bin is 3.6°. In exceptionally rare cases the number of fan markings for a particular image, and thus the number of wind measurements, is low. Such cases require special treatment and an increase in bin size. In the top left panel the same data are plotted in the wind rose diagram. This time the histogram is normalized to highlight the difference in directions if several HiRISE images are plotted in the same frame. Note that the position of zero (NA direction) depends on the location of the ROI, i.e. the wind rose diagram is map-projected to the location of the data plotted.
Thus, the direction of the fans can be directly compared to the map-projected HiRISE image (bottom panel). In this particular example one can see that the histogram has 2 peaks, which indicate that there are two distinct directions of the fans. This can be either (1) because of overlapping fan deposits from jets that erupted from the same vents at different times prior to L_s = 207.8° under different wind regimes; or (2) because different areas of the ROI have distinctly different wind regimes. In this example, comparing the derived fan directions to the sub-frame of the HiRISE image indicates that the first case is more probable.

Fig. 36 shows the directions of the fan deposits in Ithaca as retrieved by the Planet Four project for two Martian years: MY29 and MY30. We have separated the spring season into early spring, i.e. before L_s = 210°, and late spring, from L_s = 210° to L_s = 270°. The panels in this figure are organized such that columns show the separation into early and late spring, while rows show MY29 and MY30.

Ithaca fans sustain the same direction towards ≈ …° in the early spring, and the maximum shift is 25° over the whole season in MY29. Histograms widen with increasing L_s and sometimes develop double maxima, indicating more variability in the marked fan directions. This is also reflected in the increase of the standard deviation towards the end of spring. It can be attributed either to larger wind variability later in spring or to winds becoming strong enough to lift particles from the ground at times between jet eruptions. Overall, MY30 shows similar behavior to MY29.

Figure 35: Fan directions in Ithaca region at L_s = 207.8° (top) and a subframe of HiRISE image ESP_011931_0945 that can be directly compared to the wind rose diagram in the top left panel.

Figure 36: Direction of fan markings in Ithaca region for early and late spring of MY29 and MY30.
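The direction histogram described for Fig. 35 can be sketched in a few lines. The example below bins fan directions, given in degrees clockwise from north, into 3.6° bins (100 bins over the full circle), reports the centre of the most populated bin, and converts that deposition direction into the direction the wind came from by adding 180°. The function names and the sample directions are hypothetical, and the conversion from the catalog's image-frame angle to a north-referenced direction is deliberately left out, because it depends on details not restated here.

```python
def direction_histogram(directions_deg, bin_size_deg=3.6):
    """Unscaled counts of directions per bin over the full circle."""
    n_bins = round(360.0 / bin_size_deg)
    counts = [0] * n_bins
    for d in directions_deg:
        # The final modulo guards against rounding pushing a value into bin n_bins.
        index = int((d % 360.0) / bin_size_deg) % n_bins
        counts[index] += 1
    return counts

def peak_direction_deg(counts, bin_size_deg=3.6):
    """Centre of the most populated bin (first one in case of ties)."""
    peak_bin = max(range(len(counts)), key=counts.__getitem__)
    return (peak_bin + 0.5) * bin_size_deg

def wind_origin_deg(deposition_direction_deg):
    """Direction the wind blows from, opposite to the deposition direction."""
    return (deposition_direction_deg + 180.0) % 360.0

# Hypothetical fan directions, clockwise from north, for one image.
sample = [300.0, 301.0, 302.0, 120.0]
counts = direction_histogram(sample)
peak = peak_direction_deg(counts)
origin = wind_origin_deg(peak)
```

For images with very few markings, the same functions can be called with a larger bin_size_deg, which mirrors the special treatment mentioned in the text.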
The Giza region is at southern latitude 84.8°, eastern longitude 65.7°. It is located closer to the edge of the permanent cap than Ithaca. It is also near a trough with exposure of southern polar layered deposits, while the area of Giza itself is flat on the km scale (see HiRISE DTM DTEPC_004736_0950_005119_0950_A01). On smaller scales, as can be seen in multiple HiRISE images taken over this area (including those that were input for Planet Four), the region is covered in modulated bumps and small ripples. One side of this ROI is covered in yardangs.

Very large and very intricate araneiform structures are located in this region. Their troughs are narrow and long, with high degrees of branching. These araneiforms are very active in spring: multiple long and narrow fans emerge from their troughs and cover an extended area. HiRISE detected a dusty reddish haze over the araneiforms in Giza in several years, indicating active loading of dust into the lower layer of the atmosphere. The directions of the fans in late spring were previously noted to co-align with yardangs, suggesting that the wind regime in this area in summer stayed stable for an extended period of time [Hansen et al., 2010].

Similar to Ithaca, in Giza we do not observe significant differences in fan directions between MY29 and MY30 (Fig. 40 lower left panel and Fig. 37). Early images taken before L_s = 190° show very narrow histograms with a maximum between 300° and 310°. The maximum, which marks the direction of most fans, slowly shifts towards 360°. The shift rate is higher than in Ithaca (> 45° over the whole spring). The number statistics of fan detections worsen in late spring in both years, but this is particularly noticeable in late spring of MY30 (see histograms for the late spring of MY30). This is explained by the decreasing contrast between the fan deposits and the undisturbed surface around the fans in late spring images, i.e. the fans blending in with their environment.

Figure 37: Direction of fan markings in Giza region for early and late spring of MY29 and MY30. HiRISE images used: ESP_011447_0950, ESP_011448_0950, ESP_011777_0950, ESP_011843_0950, ESP_012212_0950, ESP_012265_0950, ESP_012344_0950, ESP_012704_0850, ESP_012753_0950, ESP_012836_0850, ESP_012845_0950, ESP_020150_0950, ESP_020401_0950, ESP_020480_0950, ESP_020783_0950, ESP_020902_0950, ESP_021482_0950, ESP_022273_0950.

The Manhattan region is in a very active area with at least 3 HiRISE ROIs that once were all considered under this same name. This area is around southern latitude 86°, eastern longitude 99°; like the two regions above, it is on the edge of, but still inside, the cryptic region. The ROI is located on the eastern side of a South Polar Layered Deposit (SPLD) trough that in spring is completely covered with seasonal activity. The area is inclined towards the trough, i.e. in the north-west direction, although only slightly. According to the HiRISE DTM (DTEPC_022259_0935_022339_0935_A01), there is a 270 m elevation change over approximately 8 km (≈
2° slope).

Manhattan is covered in well developed interlaced araneiforms. Similar to Giza, the araneiforms here have thin and long troughs and branch significantly. Aside from araneiforms, the surface in Manhattan is smooth, even on scales of tens to hundreds of meters, with only a few exceptions of shallow irregular pits.

Seasonal activity is extensive in Manhattan, with dark fan deposits that at times develop bright halos. Intriguingly, the araneiforms' troughs become visibly brighter compared to the rest of the surface around L_s = 200° and stay bright almost until the region completely defrosts.

Fans in Manhattan are directed 230° from the NA direction in the beginning of spring, as shown by the first observations of both analyzed years. This direction shifts during early spring and plateaus at 290° after L_s = 220°.

Figure 38: Direction of fan markings in Manhattan region for early and late spring of MY29 and MY30.

Inca City is at latitude 81.3°, eastern longitude 295.7°; relative to the aforementioned ROIs it is on the opposite side of the permanent cap and the southern pole. The topography of this location is the most complex in our list (HiRISE DTM DTEPC_022699_0985_022607_0985_A01). It is a system of over 300 m-high ridges that crisscross each other at almost right angles, forming close-to-rectangular basins. The slopes of the ridges sometimes exceed 13°, providing a variety of insolation environments in a relatively small region. The inner surface of the basins is flat, and most of the araneiforms of Inca City are carved in it. The formation of the Inca City ridge system is debated but most commonly attributed to the interaction of irregularities of the local crust with an impact-induced compaction wave [Kerber et al., 2017].

Araneiforms in Inca City are morphologically different from those in Giza and Manhattan. They have a well-developed central depression with relatively short troughs extending outwards and are on average smaller.

Seasonal activity in Inca City starts at the slopes of the ridges [Thomas et al., 2010]. Fan deposits extend downwards following gravity lines. The fans are very narrow but do not have any features of the flows (dark flows come later in spring). It is not fully clear whether the fans in this ROI are directed by gravity or by downslope winds.
The surface around and near araneiforms, on the basin floor, gets covered mostly in blotches, suggesting that no significant winds are active inside the basins.

Directions of fans in Inca City are seemingly disordered, particularly in comparison to the 3 ROIs discussed above. However, Inca City is special in this set because it has prominent topography that the other 3 ROIs lack. Thus the analysis method that works well for our other ROIs might not be applicable to Inca City. The Inca City ridges affect the local deposition of solar energy and influence near-surface winds. Directions of fans in Ithaca, Giza, and Manhattan are modified by near-surface winds that normally pass undisturbed over the whole ROI. In contrast, in Inca City fans are observed almost exclusively on the slopes of the ridges and are aligned with the down-slope direction. However, these fans appear on the slopes gradually through spring: the first fans according to our analysis are pointing to the south-west direction (270° from NA), i.e. they are located on south-west facing slopes. Early observations have the smallest standard deviation, indicating the smallest variation in the fan directions (Fig. 39). However, even in the early histograms several local maxima may be detected. The locations of the secondary maxima are determined by the slopes that were covered by the HiRISE image at each L_s. Later in spring the fans start to appear on slopes with orientations other than south-west. This widens the histogram for each HiRISE image and makes the location of the histogram maximum a less and less relevant measure of the mean fan direction. This results in a larger variation of the mean fan direction and large standard deviations (bottom right panel of Fig. 40).
Local maxima repeatedly occur at the same directions from image to image in late spring, and the whole scenario repeats in both years with only small variations.

Figure 39: Direction of fan markings in Inca City region for early and late spring of MY29 and MY30.

Figure 40: Direction of fan markings in 4 ROIs vs L_s for MY29 and MY30. Directions are plotted in degrees relative to the NA direction. Error bars represent the standard deviation of the data and not the error on the mean. Prevailing winds control the direction of fans in Ithaca, Manhattan, and Giza because the overall topography in these ROIs is smooth and has no obstacles significantly modifying the winds. In Inca City, however, the topography is more prominent, with 3 km-high ridges that break down the general winds and support the creation of katabatic flows. Thus, the fans here follow the slopes of the ridges rather than the wind direction, which is reflected in the large scatter of the mean fan direction and the large standard deviations on the mean fan direction.

8. Conclusions

The Planet Four project has produced a catalog of 158,476 fans and 249,801 blotches (ellipses), identifying locations of seasonal surface deposits produced by the CO2 jet processes occurring during spring in the Martian south polar region. The catalog was generated by combining the assessments made by Planet Four volunteers reviewing a set of 42,904 tiles derived from 221 HiRISE observations obtained over 2 Martian years, covering a set of 28 regions of interest (ROIs) across the south pole. To date, this catalog serves as the largest reporting of locations, sizes, and mapping of seasonal deposits on the Martian surface. The Planet Four fan and blotch catalog constitutes a resource for studying polar winds, climate and polar processes.
Using south polar fans as regional wind markers, the Planet Four catalog can provide tests for and input to global and regional atmospheric circulation models.

Statistical comparisons between classifications produced by the science team and catalog results for the same image data (Section 5) demonstrate that the bulk composition of the Planet Four catalog represents a fairly complete picture of the seasonal fans and blotches captured in the HiRISE images. Trend consistency for fan directions between Mars Year 29 and 30, despite the fact that most data were analyzed by different volunteers, further indicates the reliability of the methods presented here (see summary Figure 40). We have gone into considerable detail on the methodology behind the data in the catalog and are confident that its content can be productively used by our colleagues for their own research.

For 4 of the 28 ROIs we have presented mean fan directions. In three of these, the fan deposits appear to be directly modified by near-surface winds at the time of jet eruption; the fourth ROI shows the strong influence of topography. In the ROIs Ithaca, Giza, and Manhattan, the derived mean winds show no significant inter-annual variability between MY29 and MY30: their directions at the same L_s agree to within less than 10°. In Inca City, the mean direction of the fans coincides with the direction of the slopes and changes over spring as more slopes become exposed to sunlight and cold jet eruptions happen.

Our analysis in this paper focused on HiRISE observations from seasons 2 (MY29) and 3 (MY30) of the HiRISE southern seasonal processes campaign, with which research into inter-annual variability starts to become feasible. However, the HiRISE campaign now covers 6 seasons of monitoring, and for a number of selected ROIs 5 of these have been or are being analyzed by the Planet Four project at the time of writing.
The results from the analysis of these longer timespans and additional areal coverage will be topics of future publications and data releases.

Acknowledgements
The data presented in this paper are the result of the efforts of the Planet Four volunteers, generously donating their time to the Planet Four project, and without whom this work would not have been possible. Their contributions are individually acknowledged at …. Additionally we thank all those involved in BBC Stargazing Live 2013. This publication uses data generated via the Zooniverse.org platform, development of which was supported by the Alfred P. Sloan Foundation. The authors also thank Chris Lintott (University of Oxford), who had to decline authorship on this paper. We thank him for his efforts contributing to the development of the Planet Four website and for his useful discussions.

MES is currently supported by Gemini Observatory, which is operated by the Association of Universities for Research in Astronomy, Inc., on behalf of the international Gemini partnership of Argentina, Brazil, Canada, Chile, and the United States of America. MES was also supported in part by an Academia Sinica Postdoctoral Fellowship and by a National Science Foundation (NSF) Astronomy and Astrophysics Postdoctoral Fellowship under award AST-1003258. CM was supported by the 2014 Institute of Astronomy and Astrophysics, Academia Sinica (ASIAA) Summer Student Program. KMA and MES also thank the attendees of the Workshop on Citizen Science in Astronomy for the insightful conversations and acknowledge ASIAA and Taiwan's Ministry of Science and Technology (MOST) for supporting the workshop. The authors also thank Greg Hines, Cliff Johnson, Margaret Kosmala, Chris Schaller, Brooke Simmons, and Ali Swanson for insightful discussions.

This work is also partially enabled by the National Aeronautics and Space Administration (NASA) support for the Mars Reconnaissance Orbiter (MRO) High Resolution Imaging Science Experiment (HiRISE) team. This paper includes data collected by the MRO spacecraft and the HiRISE camera, and we gratefully acknowledge the entire MRO mission and HiRISE teams' efforts in obtaining and providing the images used in this analysis. The Mars Reconnaissance Orbiter mission is operated at the Jet Propulsion Laboratory, California Institute of Technology, under contracts with NASA. The authors also thank Rod Heyd for guidance in extracting the geographic and location information for HiRISE non-map-projected images. This research has made use of the USGS Integrated Software for Imagers and Spectrometers (ISIS) and of NASA's Astrophysics Data System.

KMA and GP were supported for this work by NASA ROSES Solar System Workings grant NNX15AH36G.

All software created for the pipeline is based on the open source language Python, using the matplotlib library [Hunter, 2007] for plotting, the pandas library for data wrangling and analysis [McKinney, 2010], the scikit-learn library [Pedregosa et al., 2011] for the clustering of Planet Four markings and other pre- and post-processing tasks, the IPython and Jupyter system for everyday computing [Perez and Granger, 2007], and the SciPy tools on a daily basis [Jones et al., 2001].

Appendix A. The Zooniverse's Ouroboros Web Platform
In this Section, we briefly describe the Zooniverse's Ouroboros web platform and how it interacts with the Planet Four classification interface. The Planet Four website and the Ouroboros platform are both hosted on Amazon Web Services. This makes it possible to rapidly scale up the number of servers based on the demand on the site, including handling the large number of classifiers during Stargazing Live 2013. The Planet Four classification interface is a JavaScript and CoffeeScript application that presents the classifier with the HiRISE tile and enables the volunteer to draw markers on the image and submit them for storage in the Planet Four classification database. The Zooniverse's Ouroboros platform, written in Ruby on Rails, handles the back-end storage of classifications in a Mongo database and determines the next tile that should be sent to a given Planet Four classifier for review.

Active tiles are shown to 30–100 classifiers before being retired from rotation. Once a classification is complete, the Planet Four interface sends the information via the Ouroboros Application Programming Interface (API) to be stored in the database and to update the classification count for the respective tile. If the activity on the website is low, this step is done immediately. If site traffic is high, for example 70,000 people on the website at once (such as during launch of the project), Ouroboros is designed to queue the classifications and store them asynchronously to the database so as not to impair the speed and performance of the Planet Four website. In this case the classification counts for the tiles and the list of tiles a registered or non-registered classifier has seen are not updated in real time.

The Planet Four web interface queries the Ouroboros API to identify the next tile to present to a classifier. Ouroboros checks the database and selects a random active tile that has not been previously reviewed by the Zooniverse registered user or non-logged-in session.
At any giventime, Ouroboros readies a list of 5 tiles that the classifier has not seen. When presented with arequest to see another image, the next in this list is sent back by the API. Typically this meansthe classifier rarely if ever is presented with a tile to review twice. We note there was a bug inOuroboros at launch that made repeats more prevalent. In Appendix B we describe our methodsto cleanse duplicate and spurious classifications from our final data reduction.We note that in the Zooniverse’s Ouroboros framework, refreshing the Planet Four interface inthe browser will result in a new tile being selected and displayed without updating the classifica-tion database. Refreshing the browser is just as easy as hitting the ‘Finished’ and ‘Next’ buttonsto move on to a new image, so we do not believe this has any significant impact on classifier be-havior. We mention it for completeness only. Also for the majority of the Season 2 and Season 3classifications, a memory leak in the drawing library would cause the web browser to crash aftera rather large number of fans were drawn in the image (approximately over 30–50 sources). Thisimpacted a very small fraction of tiles.
Appendix B. Handling of Duplicate or Spurious Classifications
With the Zooniverse Ouroboros queuing system (described in Appendix A), it is possible that a duplicate classification may occur, but these instances should be rare. A software bug in the Ouroboros platform caused a number of classifiers to receive the same cutout they had previously classified. Duplicate classifications are only a small portion of the data set, comprising 1.9 % of all classifications produced, and typically only a few classifications or fewer per Planet Four image tile were duplicates in those cases.
In order to treat each classification as an independent assessment, we removed all duplicate classifications, keeping only the first response for a given registered user/non-logged-in session for a given cutout.
We also found a concentration of markings positioned at the top left corner (x=0, y=0) of the marking interface, with nearly all having default values for the other recorded parameters. Only 0.12 % of the 9,631,517 markings recorded for Seasons 2 and 3 are affected. Further investigation shows that less than 7 % of fan and blotch markings with default parameters with x=0 or y=0 are not centered at the origin. Thus, we believe these origin default-valued markings are due to a JavaScript error. Therefore, we simply delete them from the database, but keep any other markings associated with those affected classifications. Additionally 33 markings ( ∼
Appendix C. Raw Classification Data
Here we provide additional details about the raw classification data provided in the online supplementary data file. It is written in the binary HDF5 format, in the variant produced by the pandas library (http://pandas.pydata.org/pandas-docs/stable/io.html), which is supported by the PyTables library.
The general structure is as follows: Each classification submission by an individual volunteer creates a classification_id. All objects created by this volunteer receive the same classification_id, with the marking data for each object being one entry in the classification database. Each data row also has a marking column that identifies whether the row is for a fan, a blotch, an interesting feature (string value “interesting”), or “none” when the volunteer did not create any marking object. Below we describe the columns available in this database:
Column name | Example value | Description
classification_id | 50ecaaf760d4050d21000414 | Unique ID for each classification by a Planet Four volunteer
created_at | 2013-01-08 23:25:43 | Time of submission
tile_id | APF0000p9t | Planet Four tile identifier
… | … | URL to image data for this Planet Four tile
user_name | abc | Originally, the Zooniverse username or non-logged-in session ID. For privacy reasons, we have converted these to anonymous IDs.
marking | blotch | Identifier for what the data in the row is for: blotch, fan, interesting, none
x_tile | 1 | x coordinate of tile inside larger HiRISE image frame. Starts at 1 in upper left of the HiRISE image, increases to the right.
y_tile | 2 | y coordinate of tile inside larger HiRISE image frame. Starts at 1 in upper left of the HiRISE image, increases downwards.
acquisition_date | 2011-01-01 00:00:00 | Date only for HiRISE observation time (ignore hours)
local_mars_time | 5:43 PM | Local Mars time for given acquisition date
x | 553.65 | x pixel coordinate of object in Planet Four tile. Starts at 0 in upper left, increases to the right.
y | 355.817 | y pixel coordinate of object in Planet Four tile. Starts at 0 in upper left, increases downwards.
image_x | 2033.65 | x pixel coordinate of object in original HiRISE image. Starts at 0 in upper left, increases to the right.
image_y | 37071.8 | y pixel coordinate of object in original HiRISE image. Starts at 0 in upper left, increases downwards.
radius_1 | 295.195 | Semi-major axis of blotch object in pixels. NaN if not applicable (N/A)
radius_2 | 294.715 | Semi-minor axis of blotch object in pixels. NaN if N/A
distance | NaN | Length of fan object in pixels. NaN if data row is for blotch or interesting
angle | 27.4331 | Orientation of marking object with respect to tile image x-axis in degrees. Positive clockwise, zero to image right (same definition as HiRISE)
spread | NaN | Opening angle of fan objects in degrees. NaN if N/A
version | NaN | Version of tool used to create fan. NaN if N/A
x_angle | 0.887549 | Cartesian x coordinate of angle column on unit circle
y_angle | 0.460713 | Cartesian y coordinate of angle column on unit circle
The Planet Four classification interface recorded a different angle than the intended spread angle from the fan marking tool. This was identified and subsequently fixed in the software. The correct spread angle is recoverable from the values stored in the database. We denote those markings generated before the patch with the version flag set to 1.0 and those after with the version flag set to 2.0. We provide the corrected spread angle for the fans affected, but leave the version flag in the final catalog for reference. To gather statistics on the understanding of the tutorial, the Planet Four classification database contains all the tutorial markings, indicated by a HiRISE image name of ‘tutorial’. For the delivered raw classification database, the range of the fan angles has been converted from -180–180 to 0–360, while the range of the blotch angles has been converted to 0–180, due to their rotational symmetry.
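To illustrate how these columns fit together, the duplicate handling of Appendix B could be sketched as follows. This is a minimal sketch in plain Python and not the pipeline's actual code: it assumes each data row is available as a dict keyed by the column names above, that the created_at strings sort chronologically (as the example value does), and the function names are hypothetical. The check for default parameter values on the origin markings is omitted, because those default values are not listed here.

```python
def first_classification_ids(rows):
    """Return the classification_ids to keep: the earliest classification
    per (user_name, tile_id) pair."""
    first = {}  # (user_name, tile_id) -> (created_at, classification_id)
    for row in rows:
        key = (row["user_name"], row["tile_id"])
        stamp = (row["created_at"], row["classification_id"])
        if key not in first or stamp < first[key]:
            first[key] = stamp
    return {cid for _, cid in first.values()}


def clean_rows(rows):
    """Drop duplicate classifications and fan/blotch markings at the origin."""
    keep = first_classification_ids(rows)
    cleaned = []
    for row in rows:
        if row["classification_id"] not in keep:
            continue  # later classification of the same tile by the same user
        at_origin = row["x"] == 0 and row["y"] == 0
        if row["marking"] in ("fan", "blotch") and at_origin:
            continue  # spurious marking at the top left corner
        cleaned.append(row)  # other markings of that classification are kept
    return cleaned


# Made-up rows for illustration only.
base = {"user_name": "u1", "tile_id": "APF0000p9t"}
rows = [
    dict(base, classification_id="c1", created_at="2013-01-08 23:25:43",
         marking="fan", x=553.65, y=355.817),
    dict(base, classification_id="c1", created_at="2013-01-08 23:25:43",
         marking="blotch", x=0, y=0),
    dict(base, classification_id="c2", created_at="2013-01-09 10:00:00",
         marking="fan", x=10.0, y=20.0),
    dict(base, user_name="u2", classification_id="c3",
         created_at="2013-01-09 11:00:00", marking="none", x=None, y=None),
]
cleaned = clean_rows(rows)
```

In this toy input, the second classification by user u1 for the same tile is treated as a duplicate, and the blotch at (0, 0) is removed while the fan from the same classification is retained.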
Appendix D. Pipeline outputs
The intermediate products of our clustering and combination pipeline are labeled with the level identifiers 1A, 1B, and 1C, indicating different stages of the processing, which is done on a per-tile_id level. After this is done, the final step combines all the data from the tens of thousands of tile_id folders into a set of summarizing CSV files.
Appendix D.1. Directory file structure
The directory file structure of the pipeline products is as follows (examples in parentheses):
• HiRISE observation ID (ESP_011350_0945)
  – Planet Four tile ID (APF0000any)
    * Level 1A (L1A/APF0000any_L1A_fans.csv)
    * Level 1B (L1B/APF0000any_L1B_fnotches.csv)
    * Level 1C with cut value 0.5 in directory name (L1C_cut_0.5/APF0000any_L1C_cut_0.5_blotches.csv)
with the list of HiRISE observation IDs identifying the HiRISE observations that went into Planet Four for this database.
Appendix D.2. Pipeline stage levels
Appendix D.2.1. Level 1A
Level 1A is the data that is directly output from clustering and averaging the cluster members into average markings, as described in Section 4.2. Here, the biggest reduction in the number of objects in the system occurs, as the markings of all the different volunteers are combined into one object when the clustering process has determined them to be part of one cluster. All newly created average fans and blotches are summarized into one fan and one blotch summary file, respectively, with each line representing the mean object from averaging all cluster members. As an example, the content of APF0000p3q_L1A_fans.csv is shown below. When a column name matches one given in Appendix C, it has the same meaning. The two new columns are n_votes, which records how many members the cluster had that was used to produce this averaged object, and marking_id, which is created at this stage of the pipeline and serves as a tracer throughout the different pipeline outputs:
   x_tile  y_tile           x           y     image_x       image_y  radius_1  radius_2
0     2.0    26.0  123.611111  455.666667  863.611111  14155.666667       NaN       NaN
1     2.0    26.0  157.000000  391.800000  897.000000  14091.800000       NaN       NaN
    distance       angle     spread  version   x_angle   y_angle  n_votes    image_id
0  81.884266  223.712817  71.559689      1.0 -0.691035 -0.660663        9  APF0000any
1  57.742472  248.754137  52.521798      1.0 -0.360802 -0.927999       10  APF0000any
        image_name marking_id
0  ESP_011350_0945    F006de3
1  ESP_011350_0945    F006de4
Additionally, each L1A folder contains a text file called clustering_setttings.yaml that summarizes the clustering settings used for these data for reference. The epsilon values are static and all the same, but the min_samples value is calculated dynamically; see Section 4.2.1 for details.
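The relation between the angle, x_angle and y_angle columns in these example rows can be checked numerically. The sketch below is our reading of the example values rather than a statement of the pipeline code: it assumes that x_angle and y_angle are the means of the cluster members' unit-circle components and that the averaged angle is recovered from them with atan2 (a circular mean). The function names are hypothetical.

```python
import math


def unit_components(angle_deg):
    """Cartesian components of an angle (degrees) on the unit circle."""
    a = math.radians(angle_deg)
    return math.cos(a), math.sin(a)


def angle_from_components(x_angle, y_angle):
    """Recover an angle in the 0-360 degree range from its components."""
    return math.degrees(math.atan2(y_angle, x_angle)) % 360.0


def mean_angle(angles_deg):
    """Circular mean: average the unit-circle components, then convert back."""
    comps = [unit_components(a) for a in angles_deg]
    x_mean = sum(c[0] for c in comps) / len(comps)
    y_mean = sum(c[1] for c in comps) / len(comps)
    return angle_from_components(x_mean, y_mean), x_mean, y_mean


# Averaged components of the two example fans listed above.
recovered_0 = angle_from_components(-0.691035, -0.660663)
recovered_1 = angle_from_components(-0.360802, -0.927999)

# Single raw marking from the Appendix C example row (angle = 27.4331).
raw_x, raw_y = unit_components(27.4331)

# Two markings straddling the 0/360 wrap-around.
wrapped_mean, _, _ = mean_angle([350.0, 10.0])
```

Averaging the components instead of the angles themselves avoids the wrap-around problem: the markings at 350° and 10° average to a direction at the 0°/360° boundary, whereas their arithmetic mean would point the opposite way at 180°.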
Appendix D.2.2. Level 1B
At level 1B, the combination pipeline has determined which objects are so close to each other that they should be considered for merging (see Section 4.3). This stage produces between one and three files. There is only one file if all fans and blotches found were so close that they need to be evaluated by their classification votes. Usually, though, there are two to three files: one file stores the objects that need voting, and the other file(s) store the objects that do not have any close neighbors and will simply be copied over to the final level later. The fans and blotches in these latter files receive a ‘vote_ratio’ value of 1.0, indicating that they had a “perfect” probability for being a fan or blotch, respectively. The file that keeps the close objects for the later thresholding contains these temporary meta-objects in sets of 2 rows, one fan and one blotch, and has the term “fnotch” in its filename (fnotches: FaN–blOTCH). This file contains all the clustering statistics data from L1A required to make a cut decision for L1C, with the data for each meta-object being sorted in alternating rows. Here are the first four rows of the fnotch file APF0000any_L1B_fnotches.csv:
             angle   distance    image_id       image_name     image_x       image_y
fan     223.712817  81.884266  APF0000any  ESP_011350_0945  863.611111  14155.666667
blotch   67.261720        NaN  APF0000any  ESP_011350_0945  838.395834  14123.875000
fan     247.146845  58.742330  APF0000any  ESP_011350_0945  832.000000  14306.400000
blotch   70.684606        NaN  APF0000any  ESP_011350_0945  821.666667  14281.428571
       marking_id  n_votes   radius_1   radius_2     spread  version           x   x_angle
fan       F006de3        9        NaN        NaN  71.559689      1.0  123.611111 -0.691035
blotch    B0071f2        8  49.309277  36.981958        NaN      NaN   98.395834  0.379131
fan       F006de5        5        NaN        NaN  81.171448      1.0   92.000000 -0.387419
blotch    B0071ed        7  35.324591  26.493443        NaN      NaN   81.666667  0.217508
        x_tile           y   y_angle  y_tile  vote_ratio
fan        2.0  455.666667 -0.660663    26.0    0.539412
blotch     2.0  423.875000  0.907431    26.0    0.460588
fan        2.0  606.400000 -0.919245    26.0    0.426667
blotch     2.0  581.428571  0.852341    26.0    0.573333
The L1B data can be used to apply a different significance threshold for the final data by filtering on the vote_ratio column in the fnotch file. For example, if a higher threshold on the probability for a fan is wanted, e.g. 0.8, one would filter out all rows that start with “fan” with a vote_ratio value below 0.8. One then needs to decide whether to use this threshold as a general “certainty” filter and simply not take any object with a vote_ratio < 0.8, or whether the blotch should appear instead of the fan.
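The two options for applying a custom cut can be sketched as below. This is a minimal illustration under stated assumptions, not the pipeline's thresholding code: rows are plain dicts, the key "kind" is a hypothetical stand-in for the fan/blotch row label, the function names are hypothetical, and treating a vote_ratio exactly equal to the cut as not passing is our choice.

```python
def pair_fnotch_rows(rows):
    """Group alternating fan/blotch rows into (fan, blotch) pairs."""
    pairs = list(zip(rows[0::2], rows[1::2]))
    for fan, blotch in pairs:
        if fan["kind"] != "fan" or blotch["kind"] != "blotch":
            raise ValueError("rows are not in alternating fan/blotch order")
    return pairs


def apply_cut(rows, cut=0.5, fallback="blotch"):
    """Return the marking_ids that survive the vote_ratio cut.

    fallback="blotch":    the blotch replaces a fan that fails the cut.
    fallback="certainty": keep only objects whose own vote_ratio passes.
    """
    kept = []
    for fan, blotch in pair_fnotch_rows(rows):
        if fan["vote_ratio"] > cut:
            kept.append(fan["marking_id"])
        elif fallback == "blotch":
            kept.append(blotch["marking_id"])
        elif fallback == "certainty" and blotch["vote_ratio"] > cut:
            kept.append(blotch["marking_id"])
    return kept


# The four example rows shown above, reduced to the relevant columns.
example_rows = [
    {"kind": "fan", "marking_id": "F006de3", "vote_ratio": 0.539412},
    {"kind": "blotch", "marking_id": "B0071f2", "vote_ratio": 0.460588},
    {"kind": "fan", "marking_id": "F006de5", "vote_ratio": 0.426667},
    {"kind": "blotch", "marking_id": "B0071ed", "vote_ratio": 0.573333},
]

default_cut = apply_cut(example_rows, cut=0.5)
strict_replace = apply_cut(example_rows, cut=0.8, fallback="blotch")
strict_certain = apply_cut(example_rows, cut=0.8, fallback="certainty")
```

With the default cut of 0.5 the first meta-object becomes a fan and the second a blotch. At a cut of 0.8 both become blotches under the replacement option, while the certainty option keeps neither, since no row reaches a vote_ratio of 0.8.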
Appendix D.2.3. Level 1C
This level contains the data of the final catalog files, but split up by Planet Four tile. At the end of the thresholding stage (Section 4.3), the rows that pass the threshold filters are appended to the respective blotch and fan files, and these completed files are copied into the L1C directory; this completes the thresholding step and fills the L1C folders. A final tool walks through each folder and collects all the fan and blotch data into one summary file each, followed by merge operations with meta-data that is useful for future analysis. These files are described in the next section, Appendix E.
Appendix E. Planet Four Catalog files description
Our catalog product files consist of one CSV result file each for the fan and blotch markings, a Planet Four tile meta-data file, and a HiRISE observation meta-data file. Below, each subsection describes the data columns for these files.
For convenience we provide both the planetocentric and planetographic latitudes for each fan’s base and blotch’s center point. Longitudes are measured 0–360, increasing positive to the East. Note that, because the HiRISE images were not co-registered, the conversion of pixel to geographical coordinates can be offset by up to 100 HiRISE pixels between data from different HiRISE images.
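The relation between the body-fixed coordinates, the two latitude conventions and the east-positive longitude can be sketched as follows. This is an illustrative sketch only, since the catalog values come from the ISIS campt utility: the Mars reference radii below (3396.19 km equatorial, 3376.20 km polar) are an assumption not stated in this appendix, and the function names are hypothetical. The inputs are the example values listed in the catalog tables of this appendix.

```python
import math

# Assumed Mars reference radii in km; not given in the text.
MARS_EQUATORIAL_RADIUS_KM = 3396.19
MARS_POLAR_RADIUS_KM = 3376.20


def body_fixed_to_lat_lon(x_km, y_km, z_km):
    """Planetocentric latitude and east-positive longitude (0-360), in degrees."""
    lat_centric = math.degrees(math.atan2(z_km, math.hypot(x_km, y_km)))
    lon_east = math.degrees(math.atan2(y_km, x_km)) % 360.0
    return lat_centric, lon_east


def centric_to_graphic(lat_centric_deg,
                       a=MARS_EQUATORIAL_RADIUS_KM,
                       b=MARS_POLAR_RADIUS_KM):
    """Planetographic latitude of a surface point on the reference ellipsoid."""
    lat = math.radians(lat_centric_deg)
    return math.degrees(math.atan((a / b) ** 2 * math.tan(lat)))


# Example body-fixed coordinates from the catalog tables [km].
lat_c, lon_e = body_fixed_to_lat_lon(-67.2071, 257.05, -3370.63)
lat_g = centric_to_graphic(-85.493)
```

Under these assumptions the example body-fixed coordinates reproduce the tabulated planetocentric latitude, planetographic latitude and longitude to within the printed precision.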
Appendix E.1. Fan catalog
Column name | Example value | Description
marking_id | F00004ab | Consistent identifier for marking after clustering. Fxxx=Fan, Bxxx=Blotch
angle | 185.4 | Alignment angle of marking measured from 3 o’clock direction, clockwise
distance | 179.6 | Length of fan in pixels
tile_id | APF0000cia | Tile identifier in the Planet Four system
image_x | 3391.2 | Base X coordinate [px] in original HiRISE image
image_y | 5640.6 | Base Y coordinate [px] in original HiRISE image
n_votes | 15 | Number of markings used for the average object
obsid | ESP_012079_0945 | HiRISE image observation ID
spread | 21.346 | Spreading angle of fan
version | 1 | Version number of fan model used in Planet Four (see Appendix C)
vote_ratio | 1.0 | Ratio of votes from a potential combination step. Value of 1.0 means only fan votes occurred.
x | 431.206 | Base X pixel coordinate in the Planet Four tile
y | 160.6 | Base Y pixel coordinate in the Planet Four tile
x_angle | -0.995088 | Polar X coordinate of alignment angle
y_angle | -0.0938355 | Polar Y coordinate of alignment angle
l_s | 214.785 | Solar longitude of HiRISE observation
map_scale | 0.25 | Factor for scaling distances to correct for HiRISE binning mode
north_azimuth | 126.857 | Direction of North in the original unprojected HiRISE input image
BodyFixedCoordinateX | -67.2071 | Base X coord. [km] in Mars-fixed ref. frame
BodyFixedCoordinateY | 257.05 | Base Y coord. [km] in Mars-fixed ref. frame
BodyFixedCoordinateZ | -3370.63 | Base Z coord. [km] in Mars-fixed ref. frame
PlanetoCentricLatitude | -85.493 | Latitude of catalog object (-centric)
PlanetoGraphicLatitude | -85.5457 | Latitude of catalog object (-graphic)
Longitude | 104.652 | Longitude of catalog object
Appendix E.2. Blotch catalog
Column name | Example value | Description
marking_id | B00004ab | Consistent identifier for marking after clustering. Fxxx=Fan, Bxxx=Blotch
angle | 185.4 | Alignment angle of marking measured from 3 o’clock direction, clockwise
tile_id | APF0000cia | Tile identifier in the Planet Four system
image_x | 3391.2 | Center X pixel coordinate in the original HiRISE image
image_y | 5640.6 | Center Y pixel coordinate in the original HiRISE image
n_votes | 15 | Number of markings used for the average object
obsid | ESP_012079_0945 | HiRISE image observation ID
radius_1 | 10.4 | Semi-major axis of blotch
radius_2 | 15.2 | Semi-minor axis of blotch
vote_ratio | 0.0 | Ratio of votes from a potential combination step. Value of 0.0 means only blotch votes occurred.
x | 431.206 | Center X pixel coordinate in the Planet Four tile
y | 160.6 | Center Y pixel coordinate in the Planet Four tile
x_angle | -0.995088 | Polar X coordinate of alignment angle
y_angle | -0.0938355 | Polar Y coordinate of alignment angle
l_s | 214.785 | Solar longitude of HiRISE observation
map_scale | 0.25 | Factor for scaling distances to correct for HiRISE binning mode
north_azimuth | 126.857 | Direction of North in the original unprojected HiRISE input image
BodyFixedCoordinateX | -67.2071 | Center X coord. [km] in Mars-fixed ref. frame
BodyFixedCoordinateY | 257.05 | Center Y coord. [km] in Mars-fixed ref. frame
BodyFixedCoordinateZ | -3370.63 | Center Z coord. [km] in Mars-fixed ref. frame
PlanetocentricLatitude | -85.493 | Latitude of catalog object (-centric)
PlanetographicLatitude | -85.5457 | Latitude of catalog object (-graphic)
Longitude | 104.652 | Longitude of catalog object (Positive East 360)
Appendix E.3. Planet Four tile catalog
Here we provide the data required to position the Planet Four tiles either back into the HiRISE images, if so required, or directly onto the Martian surface, by using the provided latitude/longitude values or their map-value equivalents in the BodyFixed-Mars frame in a rectangular coordinate system, measuring kilometers from the south pole. The coordinate values come directly from the ISIS campt utility, while the x_tile and y_tile position indices of tiles inside the HiRISE image are the result of the splitting routine that was developed by the Zooniverse team at the beginning of the project. All coordinates were calculated at the tile center pixel coordinate of (420, 324). The decimal digits precision was set to 7, guided by the Latitude/Longitude significant bits for a HiRISE pixel diameter on the ground for a 1x1 binning observation.
Column name | Example value | Description
BodyFixedCoordinateX | -67.2071 | Center X coord. [km] in Mars-fixed ref. frame
BodyFixedCoordinateY | 257.05 | Center Y coord. [km] in Mars-fixed ref. frame
BodyFixedCoordinateZ | -3370.63 | Center Z coord. [km] in Mars-fixed ref. frame
PlanetocentricLatitude | -85.493 | Latitude of catalog object (-centric)
PlanetographicLatitude | -85.5457 | Latitude of catalog object (-graphic)
Longitude | 104.652 | Longitude of catalog object (Positive East 360)
tile_id | APF0000cia | Tile identifier in the Planet Four system
obsid | PSP_003092_0985 | HiRISE observation ID of the source image for this tile
x_hirise | 840 | X pixel coordinate of the tile center in the HiRISE image
x_tile | 5 | X index of the Planet Four tile inside the HiRISE image (1-based)
y_hirise | 648 | Y pixel coordinate of the tile center in the HiRISE image
y_tile | 11 | Y index of the Planet Four tile inside the HiRISE image (1-based)
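The tile indices can be related to pixel positions in the full HiRISE image with a simple offset. The sketch below is an inference and not a documented part of the tiling routine: it assumes a tile size of 840 × 648 pixels (twice the tile center coordinate of (420, 324)) and an overlap of 100 pixels between neighboring tiles, values that are consistent with the x, y, image_x and image_y columns of the example rows in Appendix D. The constant and function names are hypothetical.

```python
# Assumed tile geometry, inferred from the tile center (420, 324) and from
# the example rows in Appendix D; not stated explicitly in the text.
TILE_WIDTH_PX = 840
TILE_HEIGHT_PX = 648
TILE_OVERLAP_PX = 100


def tile_to_image(x, y, x_tile, y_tile):
    """Convert tile pixel coordinates to HiRISE image pixel coordinates.

    x_tile and y_tile are the 1-based tile indices inside the HiRISE image.
    """
    step_x = TILE_WIDTH_PX - TILE_OVERLAP_PX
    step_y = TILE_HEIGHT_PX - TILE_OVERLAP_PX
    return x + (x_tile - 1) * step_x, y + (y_tile - 1) * step_y


# First example fan of the L1A file in Appendix D (x_tile=2, y_tile=26).
image_x, image_y = tile_to_image(123.611111, 455.666667, 2, 26)
```

For that example fan the result agrees with the tabulated image_x and image_y values of 863.611111 and 14155.666667.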
Appendix E.4. HiRISE observations catalog
This catalog provides the user with a list of HiRISE images and their meta-data that were used to create the Planet Four results presented here. The columns with capital letters were directly taken from the published cumulative EDR index (https://hirise-pds.lpl.arizona.edu/PDS/INDEX/EDRCUMINDEX.TAB). The decimal digits precision was set to 7, guided by the Latitude/Longitude significant bits for a HiRISE pixel diameter on the ground for a 1x1 binning observation.
Column name | Example value | Description
OBSERVATION_ID | ESP_011296_0975 | HiRISE observation identifier
IMAGE_CENTER_LATITUDE | -82.1965000 | Planetographic latitude of the HiRISE image center
IMAGE_CENTER_LONGITUDE | … | …
SOLAR_LONGITUDE | … | … l_s in the fan and blotch catalogs.
START_TIME | … | …
map_scale | … | …
north_azimuth | … | …
… | 91 | The number of created Planet Four tiles per HiRISE observation. Depends on original image size.
Appendix F. Extended validation results
In addition to the combined fan and blotch count explored in Section 5, we further explore here how well the Planet Four catalog identifies fans (those dark sources with a clear direction and starting point) and blotches separately. We separate the catalog and gold standard classifications by marker type in Figures F.41 to F.44. The data processing pipeline plays a significant role in the completeness of the catalog. At the thresholding stage, our data processing algorithm determines which clusters will ultimately become fans, namely those with a value of P(fan) > 0.5. As for the total number of sources, the number distributions of fans and of blotches match the expert assessments within the 3-σ uncertainty [Kraft et al., 1991]. Thus, in most cases where the science team member marked a fan, the catalog also identifies this source as a fan. Based on these results, we have high confidence in our fan and blotch identifications within the Planet Four catalog.
Figure F.41: Comparing numbers of identified fans per Planet Four tile between experts and the catalog data; here, for the 192 tile_ids that were classified by all experts. Bin size is 5; each bin is directly compared between the data from the experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). The binning maximum was cut off at 60, omitting single-entry bins above. (Panel title: “Common Expert data vs Catalog: Fans only”; vertical axis: number of tiles.)
Figure F.42: Comparing numbers of identified fans per Planet Four tile between experts and the catalog data. Bin size is 5; each bin is directly compared between the data from the experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). The binning maximum was cut off at 85, omitting single-entry bins above. (Figure title: “Expert vs Catalog object identification frequency: Fans only”; one panel each for GP, MES and KMA against the catalog; vertical axis: number of tiles.)
Figure F.43: Comparing numbers of identified blotches per Planet Four tile between experts and the catalog data; here, for the 192 tile_ids that were classified by all experts. Bin size is 5; each bin is directly compared between the data from the experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). The binning maximum was cut off at 60, omitting single-entry bins above. (Panel title: “Common Expert data vs Catalog: Blotches only”; vertical axis: number of tiles.)
Figure F.44: Comparing numbers of identified blotches per Planet Four tile between experts and the catalog data. Bin size is 5; each bin is directly compared between the data from the experts GP (blue), MES (orange), KMA (grey) and the catalog results (brown). The binning maximum was cut off at 85, omitting single-entry bins above. (Figure title: “Expert vs Catalog object identification frequency: Blotches only”; one panel each for GP, MES and KMA against the catalog; vertical axis: number of tiles.)
Appendix F.1. Example tile comparisons
In Figures F.45 and F.46 we show an example comparison of the volunteers’ markings with those performed by the science team. The aforementioned slight deviations among the science team members are visible; however, it is clear that the catalog wind directions in Fig. F.45 are well reproduced by both the specialists and the volunteers. The results for blotches in Fig. F.46 are very comparable, with the added simplification that blotches have a much reduced directivity compared to fans.
Figure F.45: Comparing volunteers’ markings and the resulting clustering with the markings performed by science team members for Planet Four tile ID APF0000hqn of HiRISE image ESP_012316_0925. The extended fan centerlines are drawn at 3 times the fan lengths to indicate the general trend of fan directions for easy visual comparison. The derived wind directions compare very well between the catalog and the science team data.
Figure F.46: Comparing volunteers’ markings and the resulting clustering with the markings performed by science team members for Planet Four tile APF000018t of HiRISE image
ESP_012889_0985. The blotches are very well comparable between the science team and the volunteers, with slight disagreements between the science team members.
References
Alger, M.J., Banfield, J.K., Ong, C.S., Rudnick, L., Wong, O.I., Wolf, C., Andernach, H., Norris, R.P., Shabala, S.S., 2018. Radio galaxy zoo: machine learning for radio source host galaxy cross-identification. Mon. Not. R. Astron. Soc. 478, 5547–5563.
Anderson, J.A., Sides, S.C., Soltesz, D.L., Sucharski, T.L., Becker, K.J., 2004. Modernization of the Integrated Software for Imagers and Spectrometers, in: Mackwell, S., Stansbery, E. (Eds.), Lunar and Planetary Science Conference, p. 2039.
Aye, K.M., Portyankina, G., Thomas, N., 2010. Semi-Automatic Measures of Activity in the Inca City Region of Mars Using Morphological Image Analysis, p. 2707. URL: http://adsabs.harvard.edu/cgi-bin/nph-data_query?bibcode=2010LPI....41.2707A&link_type=ABSTRACT
Banerji, M., Lahav, O., Lintott, C.J., Abdalla, F.B., Schawinski, K., Bamford, S.P., Andreescu, D., Murray, P., Jordan Raddick, M., Slosar, A., Szalay, A., Thomas, D., Vandenberg, J., 2010. Galaxy zoo: reproducing galaxy morphologies via machine learning. Mon. Not. R. Astron. Soc. 406, 342–353.
Becker, K.J., Anderson, J.A., Sides, S.C., Miller, E.A., Eliason, E.M., Keszthelyi, L.P., 2007. Processing HiRISE Images Using ISIS3, in: Lunar and Planetary Science Conference, p. 1779.
Bird, R., Daniel, M.K., Dickinson, H., Feng, Q., Fortson, L., Furniss, A., Jarvis, J., Mukherjee, R., Ong, R., Sadeh, I., Williams, D., 2018. Muon hunter: a zooniverse project. arXiv:1802.08907.
Bowley, C., Mattingly, M., Barnas, A., Ellis-Felege, S., Desell, T., 2018. Detecting wildlife in unmanned aerial systems imagery using convolutional neural networks trained with an automated feedback loop, in: Computational Science – ICCS 2018, Springer International Publishing. pp. 69–82.
Bugiolacchi, R., Bamford, S., Tar, P., Thacker, N., Crawford, I.A., Joy, K.H., Grindrod, P.M., Lintott, C., 2016. The Moon Zoo citizen science project: Preliminary results for the Apollo 17 landing site. Icarus 271, 30–48.
Clancy, R., Sandor, B., Wolff, M., 2000. An intercomparison of ground-based millimeter, MGS TES, and Viking atmospheric temperature measurements- Seasonal and interannual variability of temperatures . . . . Journal of geophysical . . .
Crowston, K., Fagnot, I., 2008. The motivational arc of massive virtual collaboration, in: Proceedings of the IFIP WG 9.5 Working Conference on Virtuality and Society: Massive Virtual Communities, Lüneberg, Germany.
de Villiers, S., Nermoen, A., Jamtveit, B., Mathiesen, J., Meakin, P., Werner, S.C., 2012. Formation of Martian araneiforms by gas-driven erosion of granular material. Geophysical Research Letters 39, L13204.
Ester, M., Kriegel, H.P., Sander, J., Xu, X., 1996. A density-based algorithm for discovering clusters in large spatial databases with noise. Kdd.
Ewing, R.C., Peyret, A.P.B., Kocurek, G., Bourke, M., 2010. Dune field pattern formation and recent transporting winds in the olympia undae dune field, north polar region of mars. J. Geophys. Res. 115, E11007.
Fischer, D.A., Schwamb, M.E., Schawinski, K., Lintott, C., Brewer, J., Giguere, M., Lynn, S., Parrish, M., Sartori, T., Simpson, R., Smith, A., Spronck, J., Batalha, N., Rowe, J., Jenkins, J., Bryson, S., Prsa, A., Tenenbaum, P., Crepp, J., Morton, T., Howard, A., Beleu, M., Kaplan, Z., vanNispen, N., Sharzer, C., DeFouw, J., Hajduk, A., Neal, J.P., Nemec, A., Schuepbach, N., Zimmermann, V., 2012. Planet Hunters: The first two planet candidates identified by the public using the Kepler public archive data. Monthly Notices of the Royal Astronomical Society 419, 2900–2911.
Fortson, L., Masters, K., Nichol, R., Borne, K.D., Edmondson, E.M., Lintott, C., Raddick, J., Schawinski, K., Wallin, J., 2012. Galaxy Zoo: Morphological Classification and Citizen Science, in: Way, M.J., Scargle, J.D., Ali, K.M., Srivastava, A.N. (Eds.), Advances in Machine Learning and Data Mining for Astronomy. Chapman & Hall/CRC. Data mining and Knowledge Discovery, pp. 213–236.
Greeley, R., Arvidson, R.E., Barlett, P.W., Blaney, D., Cabrol, N.A., Christensen, P.R., Fergason, R.L., Golombek, M.P., Landis, G.A., Lemmon, M.T., Others, 2006. Gusev crater: Wind-related features and processes observed by the mars exploration rover spirit. Journal of Geophysical Research: Planets 111.
Hansen, C.J., Thomas, N., Portyankina, G., McEwen, A., Becker, T., Byrne, S., Herkenhoff, K., Kieffer, H., Mellon, M., 2010. HiRISE observations of gas sublimation-driven activity in Mars’ southern polar regions: I. Erosion of the surface. Icarus 205, 283–295.
Hansen, G.B., 2005. Ultraviolet to near-infrared absorption spectrum of carbon dioxide ice from 0.174 to 1.8 μm. Journal of Geophysical Research 110, E11003.
Hunter, J.D., 2007. Matplotlib: A 2D Graphics Environment. Comput. Sci. Eng. 9, 90–95.
Jones, E., Oliphant, T., Peterson, P., 2001. SciPy: Open source scientific tools for Python.
Kaufmann, E., Hagermann, A., 2016. Experimental investigation of insolation-driven dust ejection from Mars’ CO2 ice caps. Icarus 282, 118–126.
Kerber, L., Dickson, J.L., Head, J.W., Grosfils, E.B., 2017. Polygonal ridge networks on Mars: Diversity of morphologies and the special case of the Eastern Medusae Fossae Formation. Icarus 281, 200–219.
Kieffer, H.H., 2007. Cold jets in the Martian polar caps. Journal of Geophysical Research 112, 08005.
Kraft, R.P., Burrows, D.N., Nousek, J.A., 1991. Determination of confidence limits for experiments with low numbers of counts. The Astrophysical Journal 374, 344–355.
Leighton, R.B., Murray, B.C., 1966. Behavior of Carbon Dioxide and Other Volatiles on Mars. Science 153, 136–144.
Lintott, C., Schawinski, K., Bamford, S., Slosar, A., Land, K., Thomas, D., Edmondson, E., Masters, K., Nichol, R.C., Raddick, M.J., Szalay, A., Andreescu, D., Murray, P., Vandenberg, J., 2011. Galaxy Zoo 1: Data release of morphological classifications for nearly 900 000 galaxies. Monthly Notices of the Royal Astronomical Society 410, 166–178.
Lintott, C.J., Schawinski, K., Slosar, A., Land, K., Bamford, S., Thomas, D., Raddick, M.J., Nichol, R.C., Szalay, A., Andreescu, D., Murray, P., Vandenberg, J., 2008. Galaxy Zoo: Morphologies derived from visual inspection of galaxies from the Sloan Digital Sky Survey. Monthly Notices of the Royal Astronomical Society 389, 1179–1189.
Marshall, P.J., Lintott, C.J., Fletcher, L.N., 2014. Ideas for Citizen Science in Astronomy. ArXiv e-prints arXiv:1409.4291.
McEwen, A.S., Eliason, E.M., Bergstrom, J.W., Bridges, N.T., Hansen, C.J., Delamere, W.A., Grant, J.A., Gulick, V.C., Herkenhoff, K.E., Keszthelyi, L., Kirk, R.L., Mellon, M.T., Squyres, S.W., Thomas, N., Weitz, C.M., 2007. Mars Reconnaissance Orbiter’s High Resolution Imaging Science Experiment (HiRISE). Journal of Geophysical Research: Planets 112, E05S02.
McKinney, W., 2010. Data Structures for Statistical Computing in Python, in: van der Walt, S., Millman, J. (Eds.), Proceedings of the 9th Python in Science Conference, pp. 51–56.
Newman, C.E., Gómez-Elvira, J., Marin, M., Navarro, S., Torres, J., Richardson, M.I., Battalio, J.M., Guzewich, S.D., Sullivan, R., de la Torre, M., Others, 2017. Winds measured by the rover environmental monitoring station (REMS) during the mars science laboratory (MSL) rover’s bagnold dunes campaign and comparison with numerical modeling using MarsWRF. Icarus 291, 203–231.
Nguyen, T., Pankratius, V., Eckman, L., Seager, S., 2018. Computer-aided discovery of debris disk candidates: A case study using the Wide-Field infrared survey explorer (WISE) catalog. Astronomy and Computing 23, 72–82.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E., 2011. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research 12, 2825–2830.
Peng, T.r., English, J.E., Silva, P., Davis, D.R., Hayes, W.B., 2018. SpArcFiRe: morphological selection effects due to reduced visibility of tightly winding arms in distant spiral galaxies. Mon. Not. R. Astron. Soc. 479, 5532–5543.
Perez, F., Granger, B.E., 2007. IPython: A System for Interactive Scientific Computing. Computing in Science Engineering 9, 21–29.
Piqueux, S., Byrne, S., Kieffer, H.H., Titus, T.N., Hansen, C.J., 2015. Enumeration of Mars years and seasons since the beginning of telescopic exploration. Icarus 251, 332–338.
Piqueux, S., Byrne, S., Richardson, M.I., 2003a. Polygonal Landforms at the South Pole and Implications for Exposed Water Ice, in: Sixth International Conference on Mars, p. 3275. URL: http://adsabs.harvard.edu/cgi-bin/nph-data_query?bibcode=2003mars.conf.3275P&link_type=ABSTRACT
Piqueux, S., Byrne, S., Richardson, M.I., 2003b. Sublimation of Mars’s southern seasonal CO2 ice cap and the formation of spiders. Journal of Geophysical Research 108, 5084.
Pommerol, A., Portyankina, G., Thomas, N., Aye, K.M., Hansen, C.J., Vincendon, M., Langevin, Y., 2011. Evolution of south seasonal cap during Martian spring: Insights from high-resolution observations by HiRISE and CRISM on Mars Reconnaissance Orbiter. Journal of Geophysical Research 116, E08007.
Robbins, S.J., Antonenko, I., Kirchoff, M.R., Chapman, C.R., Fassett, C.I., Herrick, R.R., Singer, K., Zanetti, M., Lehan, C., Huang, D., Gay, P.L., 2014. The variability of crater identification among expert and community crater analysts. Icarus 234, 109–131.
Sauermann, H., Franzoni, C., 2015. Crowd science user contribution patterns and their implications. Proceedings of the National Academy of Sciences 112, 679–684.
Schwamb, M.E., Aye, K.M., Portyankina, G., Hansen, C., Lintott, C.J., Allen, C., Allen, S., Calef, F.J., Duca, S., McMaster, A., Miller, G.R.M., 2017a. Discovery of araneiforms outside of the South Polar Layered Deposits, p. 422.05. URL: http://adsabs.harvard.edu/abs/2017DPS....4942205S
Schwamb, M.E., Aye, K.M., Portyankina, G., Hansen, C.J., Allen, C., Allen, S., Calef, F.J., Duca, S., McMaster, A., Miller, G.R.M., 2017b. Planet Four: Terrains – Discovery of araneiforms outside of the South Polar layered deposits. Icarus.
Schwamb, M.E., Lintott, C.J., Fischer, D.A., Giguere, M.J., Lynn, S., Smith, A.M., Brewer, J.M., Parrish, M., Schawinski, K., Simpson, R.J., 2012. Planet Hunters: Assessing the Kepler Inventory of Short-period Planets. The Astrophysical Journal 754, 129.
Simpson, R.J., Povich, M.S., Kendrew, S., Lintott, C.J., Bressert, E., Arvidsson, K., Cyganowski, C., Maddison, S., Schawinski, K., Sherman, R., Smith, A.M., Wolf-Chase, G., 2012. The Milky Way Project First Data Release: A bubblier Galactic disc. Monthly Notices of the Royal Astronomical Society 424, 2442–2460.
Smith, D.E., Zuber, M.T., Neumann, G.A., 2001. Seasonal Variations of Snow Depth on Mars. Science 294, 2141–2146.
Smith, I.B., Spiga, A., Holt, J.W., 2015. Aeolian processes as drivers of landform evolution at the South Pole of Mars. Geomorphology 240, 54–69.
Thomas, N., Hansen, C.J., Portyankina, G., Russell, P.S., 2010. HiRISE observations of gas sublimation-driven activity in Mars’ southern polar regions: II. Surficial deposits and their origins. Icarus 205, 296–310.
Thomas, N., Portyankina, G., Hansen, C.J., Pommerol, A., 2011. Sub-surface CO2 gas flow in Mars’ polar regions: Gas transport under constant production rate conditions. Geophysical Research Letters 38, L08203.
Willett, K.W., Lintott, C.J., Bamford, S.P., Masters, K.L., Simmons, B.D., Casteels, K.R.V., Edmondson, E.M., Fortson, L.F., Kaviraj, S., Keel, W.C., Melvin, T., Nichol, R.C., Raddick, M.J., Schawinski, K., Simpson, R.J., Skibba, R.A., Smith, A.M., Thomas, D., 2013. Galaxy Zoo 2: Detailed morphological classifications for 304 122 galaxies from the Sloan Digital Sky Survey. Monthly Notices of the Royal Astronomical Society 435, 2835–2860.
Zachte, E., 2012. Wikipedia Statistics Tables English. URL: http://stats.wikimedia.org/EN/TablesWikipediaEN.htm