Analysis of the Wikipedia Network of Mathematicians

Abstract

We look at the network of mathematicians defined by the hyperlinks between their biographies on Wikipedia. We show how to extract this information using three snapshots of the Wikipedia data, taken in 2013, 2017 and 2018. We illustrate how such Wikipedia data can be used by performing a centrality analysis. These measures show that Hilbert and Newton are the most important mathematicians. We use our example to illustrate the strengths and weaknesses of centrality measures and to show how to provide estimates of the robustness of centrality measurements. In part, we do this by comparison to results from two other sources: an earlier study of biographies on the MacTutor website and a small informal survey of the opinion of mathematics and physics students at Imperial College London.

Bingsheng Chen, Zhengyu Lin, Tim S. Evans
Centre for Complexity Science, and Theoretical Physics Group, Imperial College London, SW7 2AZ, U.K.
20th December 2018

Key Words: Complex networks; Social Network Analysis; Centrality Measures; Wikipedia; Crowdsourcing; History of Mathematics

Highlights
• We identify the most important mathematicians as Hilbert and Newton.
• We show how to estimate uncertainty in social network measurements using different sources.
• We use a simple model of noise to test the robustness of our network measurements.
• We use a survey of students to compare against social network results.
• Our results show how large scale crowdsourced information can provide useful insights into social science questions.

The history of mathematics shows how mankind has passed ideas between eras and cultures (Berlinghoff and Gouvea, 2004; Stedall, 2012). It illustrates how research in humanities topics is usually pieced together by experts using qualitative techniques. However, the arrival of the ability to record and analyse large data sets has opened up new approaches for research in humanities which can complement and support existing methods. In this paper we look at one particular example, the way that Wikipedia can be used to leverage information about the relationships between mathematicians.

Wikipedia is a large set of web pages maintained by the crowd. That is, anyone may edit existing pages or add a new one. There is some hierarchy, with some editors having more control over some protected pages, but largely quality assurance is intended to emerge through the consensus of the crowd. Not surprisingly, Wikipedia covers a vast range of topics. As a result of this coverage and the fact that the data is open access, easily available and readily accessible, the information in Wikipedia has been mined in many projects. This includes several which look at biographies of individuals as we do, for example see Aragón et al. (2012); Goldfarb et al. (2015); Eom et al. (2015); Ekenstierna and Lam (2016); Jatowt et al. (2016). Our work focusses on biographies of a specific profession, mathematics.

In this paper we ask a number of questions:
• Can crowdsourcing produce a useful list of individuals in one field?
• How can we use hyperlinks in Wikipedia biographies to produce a useful network?
• Can links between biographies of individuals in one field be used to produce information on that field?
• Specifically, can these links reveal something about the importance of individuals in mathematics?
• How can we measure the uncertainties in network centrality measurements?

For the first question, we use Wikipedia's list of "mathematicians" to show how such crowdsourced lists can be effective.

From such a list, we will then show how the data on mathematicians can be extracted from Wikipedia and how a simple network is a useful representation of this data. To show that the relationships between individuals encoded in this network are useful, we will measure the importance of mathematicians through network centrality measures. Centrality measures are widely used to answer this type of question; for an overview see de Nooy et al. (2005); Newman (2010); Brandes and Hildenbrand (2014); Schoch (2016); Schoch et al. (2017).

We also place great emphasis on the robustness of our results, so another important element of our work is to show how we quantify the uncertainty in our measurements.

So we will ultimately provide an answer to who is the most important person in mathematics. Our positive results will serve to support our assumption that the hyperlinks in Wikipedia biographies contain information on the importance of the pages they point to.

The Wikipedia data used in this paper, along with the code and our results of processing this data, is available online (Chen et al., 2018).

The English language Wikipedia was the primary source for this project since it contains a large number of biographies of mathematicians and a list of these biographies. It was also used because the data is open source, easily accessible and there is good supporting technical documentation. Our work was primarily with Wikipedia data extracted in 2017 but we also have data from 2013 and 2018 available for comparison.
Our data was extracted from the English language Wikipedia, processed into a network, and finally analysed using various Python packages including NetworkX (Hagberg et al., 2008).

We also used results from an earlier project (Clarke, 2011; Hopkins, 2011) which analysed a second web site of biographies of mathematicians, the MacTutor History of Mathematics archive, which was created by John J O'Connor and Edmund F Robertson rather than being crowdsourced. We will use results from this earlier analysis of the MacTutor data to make comparisons with our Wikipedia results in Section 3.9. The basic methods used by Clarke (2011) and Hopkins (2011) to produce and analyse a network derived from the MacTutor biographies are very similar to those we used for our data on the Wikipedia biographies.

Finally, in Section 2.5 we describe our informal survey of undergraduate students in the Mathematics and Physics departments at Imperial College London. This provided a fourth set of results and allowed further comparisons to be made.

Footnote: David Schoch has produced a Periodic Table of Network Centrality (Schoch, 2016) which is a nice visualisation, classification and summary of the range of network centrality measures available. The small graphs in Brandes and Hildenbrand (2014) highlight properties of the more commonly used centrality measures.

To start our analysis we must first define what we mean by a mathematician. We took the list of mathematicians on English Wikipedia as our fundamental definition of who is a "mathematician". We started from twenty-six catalogue web pages on English Wikipedia, each of which lists all the biographies of mathematicians on the English language Wikipedia whose family name starts with the same specified letter. For instance the "List of mathematicians (A)" page contains a link to "Aalen, Odd". These twenty-six index pages provided a list of URLs (Uniform Resource Locators), here the addresses of English language Wikipedia biographies of individual mathematicians. See Ekenstierna and Lam (2016) for alternative approaches to finding sets of biographies based on the profession of individuals.

The vast majority of these Wikipedia 'mathematician' pages are indeed biographies of individuals. Whether or not an expert would call them a mathematician is, in some cases, debatable. For instance many would classify Kristen Nygaard as a computer scientist or politician rather than a mathematician, yet he appears on Wikipedia's list of mathematicians.

There are also a few pages in our list which are not dedicated to a single person. For instance, the Nicolas Bourbaki page is for work produced by a variety of mainly French 20th-century mathematicians under a single pseudonym, while the individual contributions of the three Banū Mūsā brothers are often difficult to distinguish, so their work is often referred to as if it was authored by a single person. Another example is the way that the Noether Lecture is listed as an individual mathematician under "Lecture, Noether" in the 2017 data but not in our 2013 data. However, we did not find any other examples of problematic pages.

We chose to leave the data unchanged and we treat all pages in the crowdsourced list as if each was the biography of an individual mathematician.
We aim to see if "the wisdom of the crowd" can, without further intervention, provide useful information on mathematicians, whatever the strengths and weaknesses of this crowdsourced list.

Given our list of the URLs of all the biographies of mathematicians on Wikipedia, each page was exported to an XML source file, which we provide elsewhere (Chen et al., 2018). From there, the hyperlinks between these biographical web pages were found. There is much more information in these biographies and in the XML source files we provide, for instance hyperlinks to relevant topics and in the text itself, but we did not use any additional information.

Footnote: In interpreting the 26 catalogue web pages, we used the position on the page to indicate which hyperlinks were to biographies of mathematicians. We did not use other links on these catalogue pages, e.g. links to the Wikipedia home page.

Each Wikipedia page, each mathematician, was represented by a unique vertex in our graph. We then added an unweighted, undirected edge between a pair of vertices if there is at least one hyperlink in either direction between the two corresponding Wikipedia pages. Our hypothesis is that the hyperlinks between these biographies of mathematicians capture relationships between the academic work of these mathematicians and so these links reflect the way mathematics has developed. In particular, an important assumption we make is that these links carry information about the importance of the mathematicians. […] and we will ignore it.

Another assumption we make is that our graph is unweighted so that all relationships between mathematicians are equally strong. In reality, the influences between mathematicians are not equivalent and are hard to determine. Some links in the biographies also reflect personal rather than professional relationships between mathematicians.
For example, Isaac Newton's Wikipedia page mentions Charles Hutton's belief that Newton died a virgin, which does not of itself indicate a direct link between the work of the two mathematicians. It might be possible to assess the nature of a hyperlink based on the text surrounding the link, but this would either be too slow to do by hand or would require sophisticated numerical tools beyond the scope of this paper. Instead, we wish to see how far we can go with simpler tools when the data is provided on a large scale.

One way to measure the strength of a relationship could be to count the number of hyperlinks from one biography to another. Our feeling is that this may be a function of writing style and so may not be a useful measure. For instance Arnold's Wikipedia page has five references to Hilbert's thirteenth problem, and different writers could easily have given a different number of links to Hilbert's biography alongside the links to the Wikipedia page on the problem.

We have ignored the direction of the hyperlinks as it is not clear what meaning the direction might convey. Although the work of a later mathematician cannot influence the work of earlier researchers, the Wikipedia biographies can have hyperlinks in either direction with respect to time. For instance there exist hyperlinks between Galileo (died 1642) and Newton (born 1642) on both pages. In addition, while dates of birth and death can indicate the direction of influence, most of the mathematicians in our data have overlapping lifetimes.

Finally, pages may have internal references but these provided no useful meaning for us and were ignored.
This gave us no self-loops and so our network was a simple graph.

While there are many uncertainties surrounding the meaning of the relationships between mathematicians encoded in our network, the fundamental idea is that by using such a large number of mathematicians and links, the patterns we find on larger scales should capture […]

Footnote: That is, we assume the more famous mathematicians are more likely to be affected but this effect produces a similar fractional decrease in the number of their edges.
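The construction described above can be sketched in a few lines of Python with NetworkX. This is only a minimal illustration, not the code released with the paper: the function name `build_biography_graph`, the input format (a list of page titles and a list of `(source, target)` hyperlink pairs) and the toy data are all assumptions made for the example.

```python
import networkx as nx


def build_biography_graph(pages, links):
    """Build a simple, undirected, unweighted graph of biographies.

    pages: iterable of biography page titles (hypothetical input format).
    links: iterable of (source, target) hyperlink pairs (hypothetical).
    """
    graph = nx.Graph()          # nx.Graph stores at most one edge per pair
    graph.add_nodes_from(pages)
    known = set(graph.nodes)
    for source, target in links:
        if source == target:
            continue            # internal references would be self-loops
        if source in known and target in known:
            # One or many links, in either direction, give a single edge.
            graph.add_edge(source, target)
    return graph


# Toy data: two reciprocal links, one self-link, one link to a non-biography.
pages = ["Isaac Newton", "Galileo Galilei", "David Hilbert", "Odd Aalen"]
links = [
    ("Isaac Newton", "Galileo Galilei"),
    ("Galileo Galilei", "Isaac Newton"),
    ("Isaac Newton", "Isaac Newton"),
    ("David Hilbert", "Calculus"),
]
graph = build_biography_graph(pages, links)
```

Mathematicians with no links to other biographies remain in the graph as isolated vertices, which matches the later discussion of vertices outside the largest connected component.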

As noted above, some edges in our network may be incorrect, perhaps because of a lack of expertise on behalf of some editors or simply because of historical uncertainty. It is hard to determine the validity of over ten thousand edges but we expect that such a large number of relationships will ensure our results and conclusions are robust. To demonstrate this, we developed a simple model of the noise in the network in order to judge the uncertainty in our results.

We will simulate the process of editing a Wikipedia page as one of edge rewiring. We will remove a fraction $p$ of the edges, representing the decision by some editor that these were poor relationships. We will then assume that over the same period, editors will add roughly the same number of new hyperlinks to biographies. Furthermore, we will also assume that the editors will be more likely to connect to biographies with many connections, so the two vertices connected by a new edge are chosen in proportion to the original degree $k_{\mathrm{orig}}$, the degree of the vertex before any edges were changed.

To a good approximation, the process of removing edges from a vertex starting with degree $k_{\mathrm{orig}}$ is binomial with $k_{\mathrm{orig}}$ trials, leaving a mean of $(1-p)\,k_{\mathrm{orig}}$ edges. Likewise the process of adding edges back to this vertex is also roughly binomial with $(2E)p$ trials and an expectation value of $(2E)p \cdot k_{\mathrm{orig}}/(2E)$. For a vertex starting with degree $k_{\mathrm{orig}}$, the new degree, $k_{\mathrm{new}}$, is on average the same:

$$\langle k_{\mathrm{new}} \rangle = (1-p)\,k_{\mathrm{orig}} + p\,(2E)\cdot\frac{k_{\mathrm{orig}}}{2E} = k_{\mathrm{orig}}. \tag{1}$$

The degree of a vertex will fluctuate in our model with a variance $\sigma^2(k_{\mathrm{orig}})$ given approximately by

$$\sigma^2(k_{\mathrm{orig}}) = p(1-p)\,k_{\mathrm{orig}} + 2pE\,\frac{k_{\mathrm{orig}}}{2E}\left(1-\frac{k_{\mathrm{orig}}}{2E}\right) = p\,k_{\mathrm{orig}}\left(2 - p - \frac{k_{\mathrm{orig}}}{2E}\right). \tag{2}$$

Edges will be changed by editors for many reasons but, without further information, we will use the variation in edges between given mathematicians between our 2013 and 2017 datasets to motivate our choice for $p$, the level of noise in our model. We find that for the same pair of vertices, […] of the edges in the 2013 data set were also found in the 2017 dataset. Therefore, we will choose $p = 0.$[…] and use models where […] of edges have been rewired to estimate the level of uncertainty in our results. While the average degree of each vertex will be […] $\sqrt{k_{\mathrm{orig}}}$, which is compatible with the numerical results shown in Fig. A1 in the Appendix.

Footnote: Wikipedia is a website that allows users to edit its contents if the content is not protected. For all the mathematician pages used here, the highest level of protection is semi-protected, which still allows users to edit the page.

Footnote: Ignoring effects such as the correlation between vertices connected by edges and the constraints of being a simple graph.

There is no perfect way to define a ranking scheme in any context, in part because there is no perfect way to combine several different ratings into one single score (Langville and Meyer, 2012). In our case the different centrality measures are all of different numerical scales, some of which depend on normalisations in their definitions which are irrelevant constants in our context.
For this reason, and to simplify the presentation of our results, we chose to put all our measures on a scale from 0 to 100 using a simple linear rescaling, namely

$$\text{Rescaled Centrality Measure} = \frac{\text{Original Centrality Measure}}{\text{Largest Original Centrality Measure Value}} \times 100, \tag{3}$$

$$C'_v = \frac{C_v}{C_{\max}} \times 100, \qquad C_{\max} = \max(C_v \mid v \in V), \tag{4}$$

where $C_v$ is the original centrality measure for mathematician $v$, and $C_{\max}$ is the largest of those values. We do this separately for each of the centrality measures we consider and all our results will be expressed in terms of these rescaled measures. For each measure, this linear rescaling preserves the order of mathematicians as defined by that measure, but it also preserves the relative differences in the centrality scores of mathematicians.

The final source of information on the importance of mathematicians comes from a very different source. We carried out an informal survey of undergraduate mathematics and physics students at Imperial College London. The survey contained two compulsory questions: one asked for the student's current year of study, the second asked for their top three mathematicians. Participants were given a list of the top twenty mathematicians obtained from our social network analysis. At the same time, participants could nominate different mathematicians if they were not on the list provided. We divided the sample by year to see if increasing mathematical knowledge at university had a noticeable effect on the outcome. The survey was sent via email and the information was gathered using an online form.
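The edge-rewiring noise model of Section 2.3 can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the implementation used for the paper: the function name `rewire_edges` and the toy network are hypothetical, the value `p = 0.2` is an arbitrary choice for the demonstration (not the value used in the paper), and proposed self-loops and duplicate edges are simply redrawn here so that the edge count is preserved exactly.

```python
import random
from collections import Counter


def rewire_edges(edges, p, rng):
    """Return a rewired copy of an undirected simple edge list.

    A fraction p of the edges is removed at random, then the same number
    of new edges is added with both endpoints drawn in proportion to the
    ORIGINAL degree. Assumption of this sketch: self-loops and duplicate
    edges are redrawn, so the number of edges never changes.
    """
    original = sorted({tuple(sorted(edge)) for edge in edges})
    degree = Counter(node for edge in original for node in edge)
    nodes = sorted(degree)
    weights = [degree[node] for node in nodes]

    n_rewire = round(p * len(original))
    removed = set(rng.sample(original, n_rewire))
    kept = set(original) - removed

    while len(kept) < len(original):
        u, v = rng.choices(nodes, weights=weights, k=2)
        if u != v:                       # reject self-loops
            kept.add(tuple(sorted((u, v))))  # a set ignores duplicates
    return sorted(kept)


# Toy network: a ring of 30 vertices plus a hub (vertex 0).
edges = [(i, (i + 1) % 30) for i in range(30)] + [(0, j) for j in range(2, 29)]
rng = random.Random(0)
rewired = rewire_edges(edges, p=0.2, rng=rng)

# Degree of the hub over an ensemble of rewired networks.
hub_degrees = []
for _ in range(200):
    sample = rewire_edges(edges, p=0.2, rng=rng)
    hub_degrees.append(sum(1 for edge in sample if 0 in edge))
mean_hub_degree = sum(hub_degrees) / len(hub_degrees)
```

Repeating the rewiring many times and recomputing a centrality measure on each sample gives the spread of values or ranks that is used as an uncertainty estimate. Because this sketch rejects duplicate edges, high-degree vertices in a small toy network tend to end up slightly below their original degree on average, one of the simple-graph effects neglected in Eq. (1).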

In this project, we applied the method described above to three sets of Wikipedia data: the first based on pages downloaded on 13th November 2013, the second taken on 20th June 2017 and the last taken on 22nd September 2018.

The resulting network for the 2013 Wikipedia data gave us […] vertices/mathematicians. These biographies provided a total of […] hyperlinks which led to […] undirected unweighted edges in our graph. The largest connected component contained […] ([…]) of the mathematicians with […] undirected edges between the mathematicians in the largest component. Of the […] ([…]) mathematicians outside the largest connected component, almost all are isolated from each other though they typically have more links to other non-mathematician Wikipedia pages. For instance the second largest connected component contains just five mathematicians in total.

The data set downloaded in 2017 was around a third bigger in terms of the number of Wikipedia pages and links. Likewise the largest component had about 30% more edges and vertices. However, despite this large change in scale, many other properties showed very small change between 2013 and 2017 as shown in Table 1: the largest component still contained just over two thirds of the nodes and the average degree, both overall and in the largest component, grew by a few percent. The network built from the 2018 Wikipedia data showed further rises in the number of nodes (mathematicians) and edges (hyperlinks) over previous years but the growth was broadly comparable.

The lack of change in the average degree of the largest component prompted us to use our simple model for noise as described in Section 2.3. This keeps the number of nodes and edges the same while the degree of each node fluctuates by about 5% for the nodes of largest degree. The mean degree of each node over the 1000 sample networks is roughly equal to that in the 2017 data.
Our noise model does not keep other features of the data and we see small differences between the 2017 data and those produced by our noise model in some of the other measures such as average path length.

Quantity                   2013      2017      % Increase   2017 After Rewiring
Mathematicians/Vertices    […]       […]       +26.9%
Hyperlinks                 […]       […]       +33.9%
Undirected Edges           […]       […]       +31.9%       […] ± […]
Average Degree             […].21    3.[…]     +3.7%        […]
Vertices in LCC            […]       […]       +30.0%       […] ± […]
Edges in LCC               […]       […]       +31.9%       […] ± […]
Average Degree in LCC      […].71    4.[…]     +0.6%        […] ± […]
Network Diameter           13        14        +7.7%        […] ± […]
Average Path Length        […].07    5.[…]     +1.4%        […] ± […]
Clustering Coefficient     […].13    0.[…]     -7.7%        […] ± […]

Table 1: Network parameters for the 2013 and 2017 datasets, the percentage change between the 2013 and 2017 data, and the mean values found for an ensemble of 1000 rewired 2017 data sets (with one standard deviation uncertainty quoted) as defined by our noise model of Section 2.3 with p = 0.[…]. The LCC is the largest connected component.

The degree of a node is the number of edges connected to that node. As a crude measure of importance, the more biographies which are connected to Newton's biography, the more likely it is that Newton's work played an important role in either developing existing ideas or in laying the foundations for later work. The results for our measurements of the degree for the ten […]

Footnote: Five Norwegian statisticians make up the second largest connected component. They are Erling Sverdrup plus four recipients of a prize named after Sverdrup: Dag Tjøstheim, Tore Schweder, Nils Lid Hjort, and Odd Aalen. Again this illustrates how the links on a Wikipedia page may not indicate any direct mathematical connection. However, if later mathematicians are inspired or enabled by such a prize, perhaps links such as these are just as useful a measure of esteem, an indication of the influence and legacy of one mathematician, as any other type of link.

Footnote: The small change in the number of edges is due to the creation of a few self-loops which were eliminated. This effect was very small and the computational implementation was not corrected to eliminate this feature.

Figure 1: The variation in degree under the noise model for the ten mathematicians whose Wikipedia biographies have the largest degree in the 2017 data (crosses, "Original measurement"): Newton, Hilbert, Euclid, von Neumann, Klein, Aristotle, Euler, Gauss, Leibniz and Ptolemy. The circles ("Experimental measurement") give the mean degree for the same mathematicians as measured over 1000 simulations using our noise model of Section 2.3, where the error bars are specified by one standard deviation.

It is worth noting that, as expected from analysis of many websites, the degree distribution of our network of mathematicians has a fat tail as shown for 2017 in Fig. 2 (for 2018 see Fig. A5).

Figure 2: On the left, the degree distribution (probability against degree) for the 2017 network of mathematicians. The same data binned with logarithmic bins (with the ratio of consecutive bin edges set to be 1.5) is shown on the right and a best fit straight line to this data has been added (slope is −[…] ± […]).

The robustness of the ranking of mathematicians by their degree in the 2017 network is essential if we are to judge how important the differences in their degree ratings are. To do this we compare our 2017 data against the results of 1000 simulations using our noise model of Section 2.3. The fluctuations, spread, skewness and outliers of ranks in simulations can be visualized in a box-and-whisker plot in Fig. 3 (for equivalent results for 2018 data see Fig. A7).

Figure 3: Box-and-whisker plot for the degree rank of mathematicians (Isaac Newton, David Hilbert, Euclid, John von Neumann, Felix Klein, Aristotle, Leonhard Euler, Carl Friedrich Gauss, Gottfried Wilhelm Leibniz, Ptolemy) from 1000 simulations of our noise model from Section 2.3 applied to the 2017 data. The lower and upper edges of the blue box show the 25th percentile ($Q_1$) and the 75th percentile ($Q_3$) of the rank of each mathematician; the red line in the middle of the box is the median. Given the small variation here, these lines often coincide. The black lines at the end of the whiskers connected to the box are defined to be at $Q_1 - 1.5(Q_3 - Q_1)$ and $Q_3 + 1.5(Q_3 - Q_1)$. The remaining black crosses indicate outliers beyond the whiskers.

Footnote: The integer nature of degree and the small variation in rank values for the degree of the top ten mathematicians means that in most cases features of the box-and-whisker plot of degree coincide. However, we will use the same definition for the whiskers and box in later plots where this type of visualisation is more useful.
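The logarithmic binning mentioned in the caption of Fig. 2 can be illustrated with a short standard-library sketch. The function names, the choice to start the first bin at degree 1, and the toy degree sequence are assumptions made for this example; only the bin-edge ratio of 1.5 comes from the text. The straight-line fit is an ordinary least-squares fit to the log-log data, which may differ from the fitting procedure actually used.

```python
import math
from bisect import bisect_right


def log_binned_density(degrees, ratio=1.5):
    """Return (low, high, density) for each non-empty logarithmic bin.

    Consecutive bin edges differ by a factor `ratio`; the density is the
    fraction of vertices in the bin divided by the bin width.
    """
    positive = [k for k in degrees if k > 0]
    edges = [1.0]
    while edges[-1] <= max(positive):
        edges.append(edges[-1] * ratio)
    counts = [0] * (len(edges) - 1)
    for k in positive:
        counts[bisect_right(edges, k) - 1] += 1
    return [
        (low, high, count / (len(positive) * (high - low)))
        for low, high, count in zip(edges, edges[1:], counts)
        if count
    ]


def loglog_slope(bins):
    """Least-squares slope of log10(density) against log10(bin centre)."""
    xs = [math.log10(math.sqrt(low * high)) for low, high, _ in bins]
    ys = [math.log10(density) for _, _, density in bins]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Toy fat-tailed degree sequence (not the Wikipedia data).
degrees = [1] * 50 + [2] * 20 + [3] * 10 + [5] * 5 + [10] * 2 + [40]
bins = log_binned_density(degrees)
slope = loglog_slope(bins)
```

Dividing by the bin width matters: without it the widening bins would flatten the apparent slope of the tail.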
3.3 Closeness

Closeness of a node is the inverse of the average shortest path length from that node to all other nodes (Bavelas, 1950; Hagberg et al., 2008; Newman, 2010) (see equation (A1) in the Appendix for the formal definition used here). We will only use the largest component in our work with closeness. Unlike the degree, this centrality measure probes the whole structure of the network, though it does so assuming that the only important routes are the shortest paths. The idea is that the mathematician with the largest centrality has the smallest average path length and so will, on average, be the closest to any other mathematician. If two mathematicians are close then the likelihood is that the work of the two mathematicians is strongly interrelated or interdependent.

The robustness of the closeness values was again estimated using our noise model and results for the top ten mathematicians in 2017 are shown in Fig. 4 (see Fig. A8 for 2018 results).

Figure 4: Box-and-whisker plot for the rank of mathematicians by closeness, for the ten mathematicians with largest closeness (David Hilbert, John von Neumann, Emmy Noether, Hermann Weyl, Ivor Grattan-Guinness, Norbert Wiener, Georg Cantor, Bertrand Russell, Euclid, Nicolas Bourbaki). The closeness centrality is calculated for the largest component of the 2017 data and the uncertainties are estimated using 1000 simulations using the noise model of Section 2.3 with p = 0.[…]. The criteria used to place the boxes and other features of the plot are as in Fig. 3.

Betweenness centrality, like closeness, uses the length of the shortest path between nodes to try to measure importance. Betweenness of a node v is the number of shortest paths which pass through that node, summing over the shortest paths between all possible pairs of distinct nodes s and t (Freeman, 1977; Brandes, 2008; Hagberg et al., 2008; Newman, 2010). See equation (A2) in the Appendix for a formal definition.

The noise model of Section 2.3 was again used to study the uncertainty in the ranking of mathematicians based on their betweenness ratings and the results are shown in Fig. 5 (see Fig. A9 for 2018 results).

Figure 5: Box-and-whisker plot for rank by betweenness of the ten mathematicians with highest betweenness (John von Neumann, David Hilbert, Isaac Newton, Euclid, Emmy Noether, Vladimir Arnold, Nicolas Bourbaki, Felix Klein, Leonhard Euler, Hermann Weyl). This is for the largest component of the 2017 data based on 1000 simulations using the noise model of Section 2.3. The criteria used to place the boxes and other features of the plot are as in Fig. 3.

There are several different fields within mathematics such as algebra, geometry, and analysis. If a mathematician works in many different areas, individual pieces of their work may reveal connections between different areas of maths. Such a mathematician is likely to have a high betweenness reflecting the important contribution of such work. For instance, von Neumann has the highest betweenness in our 2017 data, with his Wikipedia biography suggesting he made significant contributions to many different areas of mathematics; eight Wikipedia pages on different fields of mathematics are listed on his Wikipedia biography along with further pages in other disciplines.

However, a biography can be connected to many other mathematicians in many different fields for other reasons. Some historians of mathematics have very high betweenness too. For instance, this explains why Ivor Grattan-Guinness (Rice, 2015) has the […]-th highest betweenness in our 2017 data. A similar phenomenon was also observed in our other centrality measure based on shortest paths, closeness, where Ivor Grattan-Guinness was ranked fifth by closeness in our 2017 data.
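The two shortest-path measures above can be computed on the largest connected component with NetworkX, the package cited in the text. The sketch below is a minimal illustration on a toy graph; the helper names `rescale` and `shortest_path_centralities` are hypothetical, and the rescaling to a 0 to 100 scale follows the linear rescaling of Eq. (4).

```python
import networkx as nx


def rescale(scores):
    """Linear rescaling so that the largest score becomes 100."""
    largest = max(scores.values())  # assumes at least one non-zero score
    return {node: 100.0 * value / largest for node, value in scores.items()}


def shortest_path_centralities(graph):
    """Rescaled closeness and betweenness on the largest component."""
    lcc_nodes = max(nx.connected_components(graph), key=len)
    lcc = graph.subgraph(lcc_nodes).copy()
    closeness = nx.closeness_centrality(lcc)
    # normalized=False returns raw path counts; the constant is
    # irrelevant after rescaling.
    betweenness = nx.betweenness_centrality(lcc, normalized=False)
    return rescale(closeness), rescale(betweenness)


# Toy graph: a path a-b-c-d-e plus a separate pair x-y.
graph = nx.Graph()
graph.add_edges_from([("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")])
graph.add_edge("x", "y")
closeness, betweenness = shortest_path_centralities(graph)
```

In the toy path the middle vertex "c" scores 100 on both measures, while the end vertices have zero betweenness, a small-scale version of the many zero-betweenness mathematicians discussed later in the text.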
If a mathematician is connected to a minor mathematician, then one may think that this relationship is of lower value than that between two famous mathematicians, say that between Newton and Leibniz. If all your connections are to many unimportant mathematicians we might imagine that this is of less value than having your work valued and used by a few important mathematicians. Eigenvector centrality (Hagberg et al., 2008; Newman, 2010) attempts to take the quality of your neighbours into account when assessing the importance of a node by being defined in terms of a process with feedback; the larger the eigenvector centrality of your neighbours, the larger your eigenvector centrality will be. If a mathematician publishes a new theory, the spread of this work may be likened to a broadcasting process in that this may be reused many times by many people. If that theory draws on results from many different mathematicians, this may indicate that the new work is of broad relevance and so of high impact. Eigenvector centrality tries to represent this process as the long time limit of a simple broadcast process, so the importance of a vertex emerges through the continual feedback provided by loops in the network. We perform our analysis on the largest component which then guarantees a unique value for each node in the largest component. Our formal definition is given in Section A.2 of the Appendix.

Unlike degree but like betweenness and closeness, eigenvector centrality probes the whole structure of the network. However, unlike betweenness and closeness, eigenvector centrality is not based on shortest paths in the network. It turns out that the eigenvector value for each node can be seen as the number of very long (technically infinite) walks of any type which pass through that vertex.

The robustness of the ranking of mathematicians by their eigenvector centrality value was estimated using our noise model and the results for 2017 are shown in Fig. 6 (see Fig. A10 for 2018 data).
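The feedback picture described above can be made concrete with a power iteration, in which every vertex repeatedly receives the scores of its neighbours. The following standard-library sketch is illustrative only: the function name, the adjacency-dictionary input and the toy graph are assumptions, and each vertex also keeps its own score at every step (iterating with A + I rather than A), a common device which leaves the leading eigenvector unchanged but avoids oscillations on bipartite graphs.

```python
def eigenvector_centrality(adjacency, iterations=1000, tolerance=1e-12):
    """Power iteration on a connected undirected graph.

    adjacency: dict mapping each vertex to the set of its neighbours
    (hypothetical input format). Scores are scaled so the largest is 100.
    """
    scores = {node: 1.0 for node in adjacency}
    for _ in range(iterations):
        # Each vertex keeps its own score and adds its neighbours' scores.
        updated = {
            node: scores[node] + sum(scores[nb] for nb in adjacency[node])
            for node in adjacency
        }
        largest = max(updated.values())
        updated = {node: value / largest for node, value in updated.items()}
        change = max(abs(updated[node] - scores[node]) for node in adjacency)
        scores = updated
        if change < tolerance:
            break
    return {node: 100.0 * value for node, value in scores.items()}


# Toy connected graph: hub h linked to a, b and c; a-b linked; c-d linked.
adjacency = {
    "h": {"a", "b", "c"},
    "a": {"h", "b"},
    "b": {"h", "a"},
    "c": {"h", "d"},
    "d": {"c"},
}
scores = eigenvector_centrality(adjacency)
```

In the toy graph "a" and "c" both have degree two, yet "a" scores higher because its other neighbour "b" is itself linked to the hub, whereas "c" is linked to the peripheral vertex "d".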
Figure 6: Box-and-whisker plot for rank of mathematicians derived from their eigenvector centrality (David Hilbert, Isaac Newton, Euclid, Aristotle, Gottfried Wilhelm Leibniz, Bertrand Russell, Ivor Grattan-Guinness, Georg Cantor, John von Neumann, Archimedes). This is for the largest component of the 2017 data based on 1000 simulations using the noise model of Section 2.3. The criteria used to place the boxes and other features of the plot are as in Fig. 3.

3.6 PageRank

PageRank is a centrality measure originally used to rank websites based on the network of hyperlinks linking websites (Brin and Page, 1998; Brandes, 2008; Hagberg et al., 2008; Newman, 2010). The PageRank measure is derived from a simple process on a network. In the context of our web page biographies, the model pictures people surfing a website, and then either choosing a random link on each page visited and following that link to the next page, or sometimes just jumping to a page chosen at random from all possible pages. While real individual users do not behave randomly, the success of search engines based on this method suggests that PageRank can, in some situations, capture the statistical behaviour of large numbers of users using web sites. As the mathematician Wikipedia biographies are web pages, it is not unreasonable to assume that PageRank will be equally successful on our data. A more detailed definition of PageRank is given in Section A.2.

The robustness of the ranking of mathematicians by PageRank is shown in the box plot of Fig. 7 for the 2017 data (results for 2018 shown in Fig. A11).

Figure 7: Box-and-whisker plot for rank of mathematicians derived from their PageRank ratings (Isaac Newton, David Hilbert, John von Neumann, Euclid, Felix Klein, Leonhard Euler, Aristotle, Carl Friedrich Gauss, Gottfried Wilhelm Leibniz, Emmy Noether). This is for the largest component of the 2017 data based on 1000 simulations using the noise model of Section 2.3. The criteria used to place the boxes and other features of the plot are as in Fig. 3.

3.7 Comparison of Different Centrality Measures

Each different centrality measure defines ‘important’ in a different way. While there are many aspects to importance, there is also a very large number of different centrality measures; see Schoch (2016) for a nice visualisation of this. So we should not be surprised if some of the many definitions of centrality pick up on similar aspects of importance and so give similar results. We can see this by looking at the correlations between centrality measures, a subject with a long history; for example see Valente et al. (2008), Schoch et al. (2017) and references therein.

Since the centrality scores are not generally normally distributed, we will not rely solely on the Pearson correlation coefficient to assess these correlations but will also use the alternative Spearman’s rank correlation coefficient (the Pearson correlation applied to the ranked values of the centrality measures).

largest component | Degree | PageRank | Eigenvector | Betweenness | Closeness | Average
Degree | 1.00 | 0.98 | 0.82 | 0.86 | 0.57 | 0.95
PageRank | [...]

[...] though that still represents a fair correlation. In general we expect considerable correlation if all the centrality measures are influenced by the same aspects of importance.

However, interpreting such summary statistics is difficult here because the correlation measures for the largest component are also strongly affected by the fat tail, the large numbers of mathematicians with low centrality values. For instance, the betweenness value as a function of rank by betweenness is roughly a power law for the top two thousand mathematicians but then the distribution shows a sharp cutoff. In particular, over two thousand mathematicians in the largest component have exactly zero betweenness. The discrete values of betweenness and degree, leading to many common values, are an additional factor. There are over a thousand mathematicians in the largest component with degree [...] and betweenness [...]. The vertical gaps in Fig. 9 illustrate the discreteness problem for betweenness.
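As a minimal sketch of the Spearman coefficient described above (the Pearson correlation applied to ranks), the following pure-Python code may help. Tied values are given the mean of the ranks they span, which is one common convention and an assumption here, since the text does not say how ties were handled. The two score lists are hypothetical and are chosen only to mimic a fat tail of zero values.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from largest (rank 1) downwards; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores: four distinct high values followed by six zeros.
x = [50, 30, 20, 10, 0, 0, 0, 0, 0, 0]
y = [10, 40, 5, 30, 0, 0, 0, 0, 0, 0]
print(spearman(x[:4], y[:4]))  # the four non-zero entries on their own
print(spearman(x, y))          # the whole list, including the tied zeros
```

In this toy example the four non-zero entries are uncorrelated in rank, yet the coefficient for the whole list is about 0.92, because the block of tied zeros agrees perfectly between the two measures.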
So some scatter plots appear to show a lack of correlation, as Fig. 9 suggests at first glance, but the correlation measures are high, pulled up by similar values for many low-valued mathematicians. Overall, we have to be very careful in interpreting these correlation measures and scatter plots for the largest component.

[Figure 8: scatter plot. Axes: Rank sorted by Closeness; Rank sorted by Eigenvector.]

Figure 8: Scatter plot of ranks in Eigenvector and Closeness for the largest component from the 2017 data. A strong positive correlation can be seen, matching the Spearman correlation value found of [...].

[Figure 9: scatter plot. Axes: Rank sorted by Betweenness; Rank sorted by Eigenvector.]

Figure 9: Scatter plot of ranks in Betweenness and Eigenvector for the largest component from the 2017 data. The points appear quite widely distributed with little correlation yet the Spearman correlation is [...]. The issue is that around 40% of the mathematicians have zero betweenness, producing the vertical line of points around rank 3600. Those mathematicians tend to have low eigenvector values (high rank by eigenvector) but this cannot be seen in this scatter plot.

However, even for low ranked mathematicians, interesting results can be found by looking for outliers. It is clear from Fig. 8 and Fig. 9 that while there is sometimes a general correlation or trend, there are many individual exceptions. This is where more sophisticated measures tailored to a particular context and question are needed, or perhaps simply where an expert opinion is required. For example, Solomon Kullback has a degree and a PageRank which on our scale are 2.3 and 10.2 respectively (out of 100), ranking him 2088th and 471st on the two measures respectively. His position in government agencies probably limits his known links to mathematicians, hence the low degree, yet his PageRank suggests that his work links him to important developments in mathematics.
Ivan Rival has the same values, with few links to mathematicians, yet his role as editor of a key journal of discrete mathematics, “Order”, may link him to particularly important mathematicians. Perhaps an editor of a leading journal can have a major influence on mathematics, as indicated by the higher than expected PageRank in this case.

Our data sets have a very large number of low rated mathematicians and we expect their properties to be particularly noisy. For instance, the fat-tailed degree distributions seen in Fig. 2 (and in Fig. A5) show that changing even one hyperlink in their biographies is a large proportionate change in this measure. So when discussing correlations, it makes much more sense to look at a smaller group of highly rated mathematicians. For instance, if we restrict ourselves to the top thirty-five mathematicians, we find that ties in the value of a centrality measure are rare, even for integer-valued degree. The correlation measures in this case are shown in Table 3 and in Fig. 10. From the correlation matrix for the top thirty-five mathematicians, we found that the degree measure is correlated very strongly with the PageRank centrality measure, a feature often seen with these two measures. Closeness is still poorly correlated except with the other measure based on shortest paths, betweenness.

Top 35 | Degree | PageRank | Eigenvector | Betweenness | Closeness | Average
Degree | 1.00 | 0.98 | 0.74 | 0.78 | 0.36 | 0.96
PageRank | [...]

[...] (betweenness), [...] (Eigenvector), [...] (PageRank), [...] (degree), [...] (Closeness), and [...] (average score).

Thus closeness appears to be noticeably more robust than the other centrality measures. While this does not answer the question of whether it is a good measure of importance in all contexts, the reliability of closeness does make it a more useful measure. On the other hand, betweenness is noticeably less stable than other measures, suggesting we should not rely on it as an indicator of importance.
It is interesting that closeness and betweenness were well correlated for highly ranked mathematicians and that both draw on the same set of shortest paths. However, closeness is an average over all shortest paths from one vertex, while betweenness counts just a few passing through a given vertex. So again, the instability of betweenness seems to come from its reliance on a few measurements. That aspect of betweenness is also why betweenness values are taken from a relatively small pool of likely rational (often integer) values leading to many ties, an issue we highlighted when discussing correlations above.

Footnote: For an undirected and connected graph, as we have here for the largest component, if we set α = 1 in (A4) we can show that PageRank is proportional to degree.

[Figure 10: line plot. Axes: Rank from average score; Ranks. Legend: Degree, Betweenness, Closeness, Eigenvector, PageRank.]

Figure 10: A comparison of the rank of mathematicians under different centrality measures. The horizontal axis is the rank of each mathematician by their average score; the top 35 are shown. Note that as the rank gets higher, there is a small but increasing variation in the ranks by different centrality measures for each mathematician.
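The random-surfer picture of PageRank, and the remark that PageRank becomes proportional to degree on an undirected connected graph when α = 1, can be checked numerically with a small sketch. Everything below is an illustration under stated assumptions: α is taken to be the probability of following a link (the usual damping factor), which may not match the notation of (A4) exactly; the four-vertex graph is hypothetical; and the power iteration is a generic textbook scheme, not necessarily the routine used for our results.

```python
def pagerank(neighbours, alpha=0.85, tol=1e-12, max_iter=10_000):
    """Random-surfer PageRank by power iteration on an undirected graph.

    neighbours: dict mapping each vertex to the set of vertices it links to.
    alpha: probability of following a link; with probability 1 - alpha the
           surfer jumps to a vertex chosen uniformly at random.
    Assumes every vertex has at least one neighbour (no dangling vertices).
    """
    n = len(neighbours)
    rank = {v: 1.0 / n for v in neighbours}
    for _ in range(max_iter):
        # Start each vertex with its share of the random jumps.
        new_rank = {v: (1.0 - alpha) / n for v in neighbours}
        # Each vertex passes the rest of its rank equally along its links.
        for v, nbrs in neighbours.items():
            share = alpha * rank[v] / len(nbrs)
            for w in nbrs:
                new_rank[w] += share
        change = sum(abs(new_rank[v] - rank[v]) for v in neighbours)
        rank = new_rank
        if change < tol:
            break
    return rank


# Hypothetical toy graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
# The triangle makes the walk aperiodic, so the iteration also converges at alpha = 1.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
total_degree = sum(len(nbrs) for nbrs in toy.values())  # twice the number of edges

limit = pagerank(toy, alpha=1.0)
for v in sorted(toy):
    # Second column: PageRank at alpha = 1; third column: degree / total degree.
    print(v, round(limit[v], 6), len(toy[v]) / total_degree)
```

For α below 1 the random jumps pull the scores towards a uniform distribution, so PageRank and degree are then strongly correlated rather than exactly proportional, consistent with the high correlation between the two measures reported above.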

The results of our measurements of five centrality measures on the network derived from the 2017 Wikipedia biographies of mathematicians are shown in Table 4. The equivalent results for the 2013 and 2018 data sets may be found in Table A2 and Table A4 respectively.

The simplest way to combine these different centrality ratings is to take the average of our centrality measures, remembering that each is rescaled according to (4). We have not, however, then rescaled our “average” score and for that reason the highest average rating is less than 100. We have indicated this in our tables of results, Table 4, Table A2 and Table A4. This simple average puts Hilbert as the most important mathematician with Newton only a short way behind. The third most important mathematician according to this average rating is von Neumann, who is some way behind in most ratings.

However, this is where it becomes important to estimate the uncertainty in these results. One way is to look at how different ways to combine ratings or rankings produce different results. There is no perfect way to do this and so there are many options (Langville and Meyer, 2012). We use the simplest approach: we simply count who achieves the most number-one rankings when considering each centrality measure individually. Doing that we see that, with one exception (betweenness in 2017), either Hilbert or Newton always has the highest centrality measure in either the 2013 or 2017 data. By this way of looking for the best mathematician, there is little to choose between Newton and Hilbert as both are the highest in two of our five centrality measures in 2017. In fact Newton is top in three centrality measures in 2013, so by this scheme and data he could be deemed better than Hilbert.

We will use a variation of this approach and consider a second way to combine scores, one which produces a nice visualisation.
Formally we construct a partially ordered set, a poset, from the set of mathematicians and the relationship between their rankings (see Bruggemann et al. (1994) and Loach and Evans (2017) for examples from different contexts and further references). In this case, for our set of mathematicians we say A ≻ B if each of the ratings for mathematician ‘A’ is better than the corresponding rating for ‘B’. As Hilbert has a higher closeness than Newton but Newton has the higher degree of the two, we cannot assign any relationship between these two in this poset. However Hilbert and Newton both have a higher rating than Euclid for all five centrality measures, so we can write this fact as Newton ≻ Euclid and Hilbert ≻ Euclid. We can then identify the ‘top’ nodes as those nodes T for which there is no mathematician A such that A ≻ T. For the 2013 data we find that, for almost any set of ratings, we have just two top nodes: Newton and Hilbert (see Fig. A2). For the 2017 case, as shown in Fig. 11, we find that in addition to these two we have a third mathematician at the top of our poset, von Neumann. This is because his betweenness is the highest for the 2017 Wikipedia data, as Fig. 11 shows.

This poset structure also allows us to split our mathematicians into subgroups, each of which has similar ratings but where each group is lower rated than the previous group. This is done by measuring the ‘height’ of each mathematician within the poset. To find the height of mathematician ‘A’ you have to find the longest sequence of mathematicians, a ‘chain’, from one of the top nodes to mathematician ‘A’, that is M = {M_i} where M_i ≻ M_{i+1}, the first node of the chain is a top node T and the last node is the mathematician of interest, M_ℓ = A. The height is the number of nodes in the longest chain minus one. Note the top nodes have height zero. The result is shown in Fig. 11 for the 2017 Wikipedia data (see Fig. A2 for the 2013 data).

Footnote: For instance the height of Leibniz is 2 because of the chain {Hilbert, Euclid, Leibniz}. The longest chain from the other top node is just {Newton, Leibniz}.

Name | Degree mark | Betweenness mark | Closeness mark | Eigenvector mark | PageRank mark | Average mark | Rank
David Hilbert | ?.25 | 94.58 | 100 | 100 | 88.86 | 95.14 | 1
Isaac Newton | 100 | 70.66 | 90.84 | 98.02 | 100 | 91.? | 2
John von Neumann | ?.07 | 100 | 97.34 | 66.01 | 85.62 | 85.61 | 3
Euclid | ?.05 | 66.78 | 92.? | ?.71 | 84.19 | 83.62 | 4
Felix Klein | ?.87 | 43.71 | 91.63 | 58.53 | 70.53 | 67.45 | 5
Aristotle | ?.67 | 26.59 | 84.39 | 81.67 | 63.01 | 64.47 | 6
Leonhard Euler | ?.79 | 40.25 | 89.? | ?.36 | 65.? | ?.12 | 7
Gottfried Wilhelm Leibniz | ?.26 | 29.01 | 89.59 | 76.44 | 52.99 | 60.46 | 8
Bertrand Russell | ?.39 | 31.97 | 92.? | ?.74 | 48.? | ?.78 | 9
Emmy Noether | ?.16 | 46.98 | 93.96 | 45.08 | 52.28 | 57.89 | 10
Carl Friedrich Gauss | ?.59 | 38.69 | 89.97 | 40.51 | 59.68 | 57.09 | 11
Hermann Weyl | ?.96 | 39.09 | 93.88 | 50.33 | 45.33 | 54.72 | 12
Ivor Grattan-Guinness | ?.53 | 29.05 | 92.? | ?.67 | 35.56 | 53.52 | 13
Georg Cantor | ?.86 | 26.? | ?.64 | 66.39 | 37.58 | 53.06 | 14
Nicolas Bourbaki | ?.86 | 45.? | ?.73 | 24.53 | 45.66 | 49.? | 15
Charles Sanders Peirce | ?.21 | 23.76 | 90.47 | 57.58 | 35.28 | 48.86 | 16
Norbert Wiener | ?.98 | 31.64 | 92.65 | 35.59 | 41.21 | 47.82 | 17
Galileo Galilei | ?.51 | 13.79 | 80.66 | 52.93 | 43.61 | 47.? | 18
Archimedes | ?.51 | 9.? | ?.03 | 58.62 | 41.? | ?.37 | 19
Vladimir Arnold | ?.11 | 46.? | ?.47 | 19.97 | 44.51 | 46.89 | 20
Ptolemy | ?.71 | 9.45 | 75.02 | 40.39 | 49.44 | 45.? | 21
Christiaan Huygens | ?.76 | 13.? | ?.02 | 50.83 | 37.83 | 45.19 | 22
Johannes Kepler | ?.64 | 15.23 | 80.? | ?.99 | 41.22 | 45.14 | 23
G. H. Hardy | ?.98 | 31.58 | 89.74 | 23.12 | 42.86 | 45.06 | 24
Alan Turing | ?.98 | 24.83 | 88.18 | 29.74 | 41.47 | 44.44 | 25
Michael Atiyah | ?.09 | 34.02 | 87.25 | 11.88 | 47.78 | 44.? | 26
Alfred Tarski | ?.53 | 23 | 86.23 | 31.27 | 41.11 | 44.23 | 27
Alexander Grothendieck | ?.64 | 25.83 | 85.76 | 11.83 | 45.71 | 42.35 | 28
Bernhard Riemann | ?.01 | 19.04 | 90.88 | 39.16 | 28.72 | 41.76 | 29
George Boole | ?.78 | 12.24 | 84.67 | 42.09 | 30.74 | 40.? | 30
Andrey Kolmogorov | ?.11 | 27.43 | 85.? | ?.77 | 42.47 | 40.21 | 31
William Rowan Hamilton | ?.46 | 19.72 | 88.66 | 32.93 | 29.63 | 40.08 | 32
Emil Artin | ?.01 | 23.? | ?.34 | 21.27 | 35.42 | 39.93 | 33
Alfred North Whitehead | ?.91 | 13.? | ?.22 | 42.91 | 27.? | ?.63 | 34
Martin Gardner | ?.78 | 25.93 | 84.56 | 16.82 | 38.81 | 39.58 | 35

Table 4: Centrality scores for top 35 mathematicians from the 2017 data (without noise), each measure rescaled so the largest value is 100. Mathematicians are then ordered by the average mark. A ? marks digits that are missing from this copy of the table.

[...] Equally, while there is less change at the top of our lists from 2013 to 2017, we still see small changes as high as the fifth and sixth place. Again the results from our noise model in Table 5 confirm this behaviour, in this case showing our fifth place Klein and sixth place Aristotle are too close to be sure of their relative position. Similar variations can be seen when comparing to the 2018 data shown in the Appendix, e.g. Table A4.

As always, it is the outliers that attract attention.
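Returning briefly to the poset construction used for Fig. 11, the dominance relation, the top nodes and the heights can be sketched in a few lines. This is an illustration only: the labels A to D and their five scores are hypothetical and are not taken from our tables, and ‘better’ is assumed to mean a strictly larger score on every measure.

```python
from functools import lru_cache

# Hypothetical scores on five centrality measures (not taken from Table 4).
ratings = {
    "A": (100, 95, 100, 100, 90),
    "B": (98, 70, 90, 98, 100),
    "C": (80, 67, 92, 85, 84),
    "D": (50, 29, 89, 76, 53),
}


def dominates(a, b):
    """True if every score in a is strictly larger than the matching score in b."""
    return all(x > y for x, y in zip(a, b))


def top_nodes(scores):
    """Nodes that no other node dominates."""
    return {
        t for t in scores
        if not any(dominates(scores[a], scores[t]) for a in scores if a != t)
    }


def heights(scores):
    """Height of each node: number of nodes in the longest chain ending there, minus one."""
    names = tuple(scores)

    @lru_cache(maxsize=None)
    def height(name):
        above = [a for a in names if a != name and dominates(scores[a], scores[name])]
        # A node with nothing above it is a top node and has height zero.
        return 0 if not above else 1 + max(height(a) for a in above)

    return {name: height(name) for name in names}


print(sorted(top_nodes(ratings)))
print(heights(ratings))
```

Because dominance is transitive, the longest chain ending at a node always starts at a top node, so the recursion over the nodes directly above gives the same height as the definition in terms of chains from a top node.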
Having set the scale of the expected changes, the mathematician in the top 35 list from 2017 who shows the biggest change is Nicolas Bourbaki, which is actually a pseudonym for a group of mainly French 20th-century mathematicians. Bourbaki moved from 42nd in 2013 to 15th in 2017. This suggests that this English Wikipedia article has undergone unusual and substantial expansion, e.g. the number of hyperlinks to other identified mathematicians has gone from 34 to 54 in these four years.

Finally we note the presence of a few names who perhaps did not contribute directly to specific developments in mathematics. The historian of mathematics Grattan-Guinness (Rice, 2015) has a high rank because he is linked to so many mathematicians. However we also note that Martin Gardner is 35th in the 2013 list (and 36th in 2017 so just off our table). His role in mathematics is as one of the best known popularisers of mathematics working in the English language in the second half of the 20th century, illustrating that you can make important contributions to mathematics in many different ways. How many mathematicians today were inspired by Gardner’s work?

The results we have obtained can be compared with those of a different database, the MacTutor History of Mathematics archive created by John J O’Connor and Edmund F Robertson. This is a web site of biographies of famous mathematicians, with hyperlinks between these biographies. A network was constructed by again setting each biographical web page to be a vertex. As for the Wikipedia biographies, a vertex almost always represented a single mathematician. A directed edge was assigned from one mathematician to another if there was at least one hyperlink between the two biographies. The major difference between MacTutor and Wikipedia is that the MacTutor pages are not open to the public but are curated and written by O’Connor and Robertson.
One result is that our MacTutor network has only 2249 vertices/mathematicians and 16980 directed edges between them, roughly a third of the size of our Wikipedia networks. This has been analysed by one of us (TSE) working with several other researchers but the results we quote here are based on the analysis of the data from late 2010 described in Clarke (2011). In particular, the centrality measures (for the directed network) for the top fifteen mathematicians are reproduced from Clarke (2011) in Table A1.

Footnote: The turnover in such ranked lists has been studied in other contexts but those techniques and suggested power-laws would require data over a longer period to be useful here (Bentley et al., 2007; Evans and Giometto, 2011).

Footnote: The only two exceptions known were for the pages dedicated to the work of the collectives Nicolas Bourbaki and the Banū Mūsā brothers, which were also each represented by a single vertex.

The results for the top mathematicians are very similar to those we obtained from the Wikipedia data. This is a further check of the robustness of our results. It also suggests that for centrality measures there is not much difference in results between the use of directed and undirected networks based on hyperlinks between biographies. The similarity between results based on Wikipedia and MacTutor data is not so surprising. Both sets of biographies are written in English, which might suggest common biases. Both web sites are free to view and so it is very likely that a writer for one web site consciously or unconsciously drew on material from the other web site.

To get a rough idea, if we average the MacTutor ranks across the four classic centrality measures we used for the Wikipedia data we find the following: Newton 1.5, Euclid 2.5, Hilbert 3.75, Riemann 4.2, and Euler 6.0. Two differences stand out when compared to the Wikipedia data. First, Riemann is fourth by the MacTutor data but is 29th in the 2017 Wikipedia data. Secondly, Wikipedia rates von Neumann as the third most important mathematician while the analysis of the MacTutor biographies does not see him in the top ten. Again, given the similarity of other results this seems to highlight differences in the interests or expertise of the editors of these web sites, or perhaps in the procedures which lead to the public versions of the biographies.

Name | Degree | Betweenness | Closeness | Eigenvector | PageRank | Average | Rank
David Hilbert | ? ± ? | ? ± ?.38 | 99.? ± ?.05 | 97.? ± ?.35 | 88.? ± ?.03 | 94.? ± ?.83 | 1
Isaac Newton | ? ± ? | ? ± ?.10 | 94.? ± ?.38 | 92.? ± ?.55 | 99.? ± ?.37 | 93.? ± ?.93 | 2
John von Neumann | ? ± ?.43 | 93.? ± ?.64 | 97.? ± ?.14 | 65.? ± ?.62 | 82.? ± ?.87 | 83.? ± ?.42 | 3
Euclid | ? ± ?.71 | 65.? ± ?.54 | 94.? ± ?.33 | 82.? ± ?.47 | 85.? ± ?.05 | 82.? ± ?.23 | 4
Felix Klein | ? ± ? | ? ± ?.27 | 93.? ± ?.21 | 57.? ± ?.04 | 70.? ± ?.28 | 69.? ± ?.02 | 5
Aristotle | ? ± ? | ? ± ?.88 | 90.? ± ?.46 | 72.? ± ?.32 | 64.? ± ?.15 | 66.? ± ?.28 | 6
Leonhard Euler | ? ± ? | ? ± ?.63 | 91.? ± ?.41 | 54.? ± ?.99 | 64.? ± ?.19 | 63.? ± ?.77 | 7
Gottfried Wilhelm Leibniz | ? ± ?.45 | 34.? ± ?.53 | 92.? ± ?.31 | 68.? ± ?.89 | 53.? ± ?.66 | 60.? ± ?.58 | 8
Bertrand Russell | ? ± ?.27 | 34.? ± ? | ? ± ?.33 | 63.? ± ?.36 | 48.? ± ?.32 | 58.? ± ?.42 | 9
Emmy Noether | ? ± ?.28 | 46.? ± ?.96 | 94.? ± ?.22 | 45.? ± ?.08 | 51.? ± ?.42 | 57.? ± ?.43 | 10
Carl Friedrich Gauss | ? ± ? | ? ± ?.67 | 90.? ± ?.31 | 40.? ± ?.97 | 58.? ± ?.82 | 57.? ± ?.13 | 11
Hermann Weyl | ? ± ?.98 | 37.? ± ?.72 | 93.? ± ?.22 | 48.? ± ?.07 | 44.? ± ?.07 | 53.? ± ?.96 | 12
Ivor Grattan-Guinness | ? ± ?.81 | 27.? ± ?.89 | 93.? ± ?.14 | 62.? ± ?.75 | 36.? ± ?.77 | 52.? ± ?.84 | 13
Georg Cantor | ? ± ?.87 | 26.? ± ?.72 | 93.? ± ?.19 | 59.? ± ?.78 | 38.? ± ?.82 | 51.? ± ?.83 | 14
Galileo Galilei | ? ± ?.12 | 20.? ± ?.48 | 86.? ± ?.67 | 47.? ± ?.22 | 45.? ± ?.32 | 49.? ± ?.05 | 15
Archimedes | ? ± ?.12 | 17.? ± ?.91 | 85.? ± ?.64 | 51.? ± ?.45 | 43.? ± ?.08 | 48.? ± ?.97 | 16
Nicolas Bourbaki | ? ± ?.95 | 39.? ± ?.68 | 91.? ± ?.33 | 26.? ± ?.91 | 44.? ± ?.18 | 48.? ± ?.69 | 17
Ptolemy | ? ± ?.28 | 18.? ± ?.12 | 83.? ± ? | ? ± ?.33 | 51.? ± ? | ? ± ?.79 | 18
Charles Sanders Peirce | ? ± ?.53 | 23.? ± ?.41 | 91.? ± ?.29 | 51.? ± ?.39 | 35.? ± ?.59 | 47.? ± ?.56 | 19
Norbert Wiener | ? ± ?.75 | 30.? ± ?.65 | 92.? ± ?.29 | 35.? ± ?.25 | 40.? ± ?.97 | 47.? ± ? | 20
Johannes Kepler | ? ± ?.84 | 20.? ± ?.15 | 86.? ± ?.61 | 41.? ± ?.47 | 41.? ± ?.99 | 46.? ± ?.65 | 21
Christiaan Huygens | ? ± ?.75 | 16.? ± ?.46 | 87.? ± ?.46 | 44.? ± ? | ? ± ?.88 | 45.? ± ?.41 | 22
Michael Atiyah | ? ± ?.82 | 33.? ± ?.07 | 88.? ± ?.53 | 15.? ± ?.14 | 45.? ± ?.19 | 44.? ± ?.35 | 23
G. H. Hardy | ? ± ?.69 | 30.? ± ?.83 | 89.? ± ?.37 | 24.? ± ?.44 | 41.? ± ?.96 | 44.? ± ?.26 | 24
Alexander Grothendieck | ? ± ?.99 | 30.? ± ?.48 | 87.? ± ?.61 | 16.? ± ?.23 | 44.? ± ?.16 | 44.? ± ?.52 | 25
Alfred Tarski | ? ± ?.85 | 24.? ± ? | ? ± ?.42 | 29.? ± ?.08 | 40.? ± ?.01 | 44.? ± ?.27 | 26
Vladimir Arnold | ? ± ? | ? ± ?.26 | 89.? ± ?.54 | 20.? ± ? | ? ± ?.07 | 44.? ± ?.41 | 27
Alan Turing | ? ± ?.73 | 24.? ± ?.27 | 88.? ± ? | ? ± ?.08 | 40.? ± ?.94 | 43.? ± ?.25 | 28
Bernhard Riemann | ? ± ?.35 | 18.? ± ?.48 | 90.? ± ?.17 | 37.? ± ?.26 | 29.? ± ?.33 | 41.? ± ?.92 | 29
Nicolaus Copernicus | ? ± ? | ? ± ?.13 | 84.? ± ?.78 | 40.? ± ?.68 | 33.? ± ?.67 | 41.? ± ?.45 | 30
George Boole | ? ± ?.43 | 15.? ± ?.06 | 87.? ± ?.38 | 37.? ± ?.73 | 30.? ± ?.47 | 40.? ± ?.02 | 31
Pierre de Fermat | ? ± ?.27 | 14.? ± ?.14 | 89.? ± ?.33 | 44.? ± ? | ? ± ?.22 | 40.? ± ?.99 | 32
Andrey Kolmogorov | ? ± ?.58 | 26.? ± ?.15 | 85.? ± ?.57 | 13.? ± ?.76 | 39.? ± ? | ? ± ?.93 | 33
Emil Artin | ? ± ?.35 | 23.? ± ?.98 | 88.? ± ?.36 | 22.? ± ?.24 | 33.? ± ? | ? ± ?.84 | 34
Gaetano Fichera | ? ± ?.59 | 21.? ± ?.96 | 85.? ± ? | ? ± ?.72 | ? | 39.? ± ?.82 | 35

Table 5: Centrality scores for top 35 mathematicians derived from the noise model of Section 2.3 applied to the 2017 data with p = 0.[...] for 1000 simulations. The mean value and one standard deviation are quoted for each centrality measure for each mathematician. The scores for each run are always rescaled so that the largest value is 100, which explains why the value quoted for any one centrality measure is always less than 100. The column marked average gives the average over the five named centrality measures with associated standard deviation. Mathematicians are ordered in terms of this average and the ranks given are in terms of this average over centrality values. A ? marks digits that are missing from this copy of the table: in most cells the decimal part of the mean and the integer part of the standard deviation are lost. Only five entries survive in the final row, so the column placement of its last two entries is uncertain.

The final comparison we make is with our informal survey of undergraduate students studying mathematics or physics at Imperial College London.
The results of this survey are shown in Table 6.

Mathematician | Year 1 | Year 2 | Year 3 | Year 4 & Other | Total votes | 2017 Rank
Leonhard Euler | 80 | 78 | 54 | 34 | 246 | 7
Friedrich Gauss | 57 | 52 | 46 | 26 | 181 | 11
Isaac Newton | 55 | 50 | 38 | 23 | 166 | 2
Euclid | 48 | 43 | 29 | 16 | 136 | 4
Leibniz | 25 | 19 | 12 | 4 | 60 | 8
David Hilbert | 13 | 9 | 13 | 7 | 42 | 1
Aristotle | 15 | 9 | 7 | 2 | 33 | 6
Alessio Corti | 4 | 13 | 9 | 5 | 31 | 1019
Emmy Noether | 6 | 0 | 10 | 10 | 26 | 10
von Neumann | 2 | 3 | 8 | 7 | 20 | 3

Table 6: The top 10 mathematicians as derived from the results of our questionnaire sent in November 2016 to undergraduates in the Physics and Mathematics departments of Imperial College London. The 2017 Rank is from the Wikipedia data.

Nine of the ten mathematicians mentioned come from the top eleven of the 2017 Wikipedia data (see Table 4), which at first sight appears to show a good general consistency between undergraduates and the web site data discussed above. After all, all but one mathematician nominated by participants is in our top twenty. Either our social network analysis is a good reflection of informed student opinion or it is merely a reflection of the way we constructed the survey, since participants were given the list of top twenty names from our social network analysis to choose from (though they could add other names, as one entry shows). Since eight of the nine mathematicians chosen by participants came from the top eleven of the list provided, we feel that this shows that participants were not unduly biased by the list provided; otherwise we would have seen more names from those ranked below eleventh.

However it is also interesting to see that there is a very different order here as compared to that found with network analysis of the web sites. In particular, Hilbert and von Neumann are considerably underrated by undergraduates as compared to the Wikipedia and MacTutor rankings.
This may reflect their limited direct impact on, or simply their lack of visibility in, the undergraduate syllabus followed by these students.

It is also interesting that there is good consistency between students in different years, with the ranking of the top four (taking around 80% of the votes) being identical. This suggests that the amount of training appears to have relatively little effect on the choice of best mathematician by these undergraduates.

Finally, the ‘joie de vivre’ of these undergraduates is clearly evident as the head of the mathematics department at the time of the survey appears as the most recent mathematician on this list.
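The consistency between year groups can be checked directly from the vote counts in Table 6. The short sketch below is only an illustration: it uses just the ten rows of the table, so the share it computes is relative to the votes for those ten names rather than to all votes cast, which is an assumption about what ‘around 80% of the votes’ refers to.

```python
# Votes per year group (Year 1, Year 2, Year 3, Year 4 & Other), copied from Table 6.
votes = {
    "Leonhard Euler": (80, 78, 54, 34),
    "Friedrich Gauss": (57, 52, 46, 26),
    "Isaac Newton": (55, 50, 38, 23),
    "Euclid": (48, 43, 29, 16),
    "Leibniz": (25, 19, 12, 4),
    "David Hilbert": (13, 9, 13, 7),
    "Aristotle": (15, 9, 7, 2),
    "Alessio Corti": (4, 13, 9, 5),
    "Emmy Noether": (6, 0, 10, 10),
    "von Neumann": (2, 3, 8, 7),
}


def top_four(year_index):
    """Names of the four most-voted mathematicians in one year group, in order."""
    ordered = sorted(votes, key=lambda name: votes[name][year_index], reverse=True)
    return ordered[:4]


orders = [top_four(i) for i in range(4)]
same_order_every_year = all(order == orders[0] for order in orders)

totals = {name: sum(counts) for name, counts in votes.items()}
listed_total = sum(totals.values())  # votes for the ten listed names only
top_four_share = sum(totals[name] for name in orders[0]) / listed_total

print(orders[0])
print(same_order_every_year, round(top_four_share, 3))
```

The row totals computed here agree with the Total votes column of Table 6, and the top four names appear in the same order in every year group.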

Discussion

The simplest conclusion is that on the basis of our Wikipedia data we would suggest that the two most important mathematicians are Hilbert and Newton. We have shown that we can also put an estimate of the uncertainty around such ratings by using simple models of the noise in the system. Since we have Wikipedia data separated by four or five years, we have also been able to use the changes in the rankings of mathematicians over that period to get an estimate of the uncertainties in our ranking. That means we are fairly confident in our results for the top four, while for those ranked around thirty we suggest that an uncertainty of seven or so places is consistent with our analysis.

We have also shown how different sources can be used to provide further checks on the robustness of our conclusions. An independent web site created by John J O’Connor and Edmund F Robertson, MacTutor (MacTutor History of Mathematics archive, O’Connor and Robertson (2017)), gives broadly similar results (Clarke, 2011; Hopkins, 2011).

Even our informal survey of undergraduates is fairly consistent with the results from the larger Wikipedia and MacTutor studies. Where the comparisons are most interesting is in the differences. In particular von Neumann is the third most important mathematician on the Wikipedia data but is far lower in the results from the survey and MacTutor. Is this due to an over-representation of Wikipedia editors who have an interest in computer science and who particularly admire von Neumann’s contributions in that field? Conversely, perhaps the British higher educational system in mathematics, be it the teachers who are the editors of MacTutor or the students answering the survey, fails to give due weight to von Neumann’s work because of its importance to a separate field, computer science.
This shows that simple quantitative measures provide useful information but expert opinion is still required to understand many details.

It would be interesting to see how other lists of great mathematicians compare against the ones produced using our methods. One approach would be to use well established measures of esteem to rank mathematicians, for instance using bibliometric methods such as citation count or h-index. However the traditional bibliometric measures are flawed when making comparisons across large time scales and across many different topics. The list of mathematicians awarded prizes, such as the Fields medal, could produce sets of great mathematicians, if no precise ranking. Such a list highlights some of the advantages of our approach. Each prize has constraints in terms of subject about which one might argue. Should Ed Witten, who is typically described as a ‘physicist’, have been awarded a Fields medal, the Nobel Prize of mathematics? Witten is, in fact, in our data but does not make our list of top 35 mathematicians. Other constraints apply to prizes too. The Fields medal is awarded only to those under the age of 40. Prizes are often awarded only to living people and so prizes do not have the historical reach of our Wikipedia approach. In fact, only two Fields medal winners are in our top 35 mathematicians based on the 2017 data in Table 5, Atiyah and Grothendieck, who were both awarded the medal in 1966. Of the others in our top 35 mathematicians, only Turing was eligible for the Fields medal, illustrating a drawback of lists of winners of age-limited prizes. The selection process for most prizes is secret. Alan Turing, who is typically ranked around 20th to 30th in our ratings, was young enough and recent enough to have been awarded a Fields medal but that did not happen. Yet Turing has a whole prize named after him, surely an even bigger measure of esteem.
Prizes, like the many ad-hoc lists of great mathematicians produced by expert opinion, are created by hidden processes with unknown biases. One might disagree with choices of those listed as being a mathematician on Wikipedia or with the links made between them, or indeed with the measure we have used to arrive at our conclusions. However at least our approach is completely open, unlike most alternative approaches.

The comparison of our lists with the list of Fields medalists highlights that time has an important effect. Most of our top 35 mathematicians did their work over a hundred years ago. It seems likely that it was easier to have a larger impact on mathematics when the subject was young and our list reflects that. It would, though, be interesting to compare like with like, perhaps comparing modern mathematicians using our methods and modern bibliometric methods. Does our Wikipedia based rating for a mathematician lag behind the citation count of their work? For modern mathematicians, it would be interesting to compare our rankings with those available from other sources: prizes, bibliometric measurements etc. However, that is a different project. Name disambiguation is a serious problem in matching lists, and the effect of time and field makes comparison of even modern mathematicians difficult. Our approach, perhaps even our data, could provide a starting point for such a project.

An important assumption in our work is that our definition of who is a “mathematician” is a good one. For our main data set, we used the list of mathematicians provided on English Wikipedia. Interestingly, our crowdsourced definition agrees with the Fields medal committee in that it includes a ‘physicist’ (Witten) in the list of mathematicians. Our crowdsourced categorisation brings with it the strengths, and weaknesses, of that approach. We can contrast this with the approach used to compile the list of mathematicians in the MacTutor database, which is based on the expert opinion of a pair of curators.
It is another important result of our work that these two different approaches to defining a collection of top mathematicians have provided comparable results. Our results provide further evidence that a crowdsourced approach to difficult questions can be an effective and reliable method.

It is worth challenging this assumption further. Suppose we pick a mathematician at random from our data. Given our data is fat-tailed, e.g. in terms of degree, we are very likely to pick a lowly ranked mathematician. The example of Kristen Nygaard is instructive as many of us would probably classify him as a computer scientist. Nygaard developed the core concepts of object-oriented programming, for which he was awarded the Turing award, the Nobel prize or Fields medal equivalent for computer science. However, computer science often overlaps with mathematics, as Turing himself demonstrates. In addition, Nygaard had a master’s in mathematics and worked for a time in operational research. The authors and referees might well use their expert opinions to exclude Nygaard from their optimal list of mathematicians, in which case we might think our Wikipedia crowdsourced list of mathematicians contains mistakes.

This is not a good viewpoint in our opinion. Different experts will always have different opinions. It is easy to say two lists are different; it is difficult, if not impossible, to say if one list is better than another. Most people would agree that Nygaard is at best a marginal case, mostly a computer scientist and only peripheral to the development of mathematics. Many other people in a similar position to Nygaard could be in our list of mathematicians. Equally there could be many others who are not in our list of mathematicians but who have a good case to be included. Essentially, any definition of a mathematician is uncertain, and that is a source of noise in any list.

However the example of Kristen Nygaard also shows why our method is so powerful.
If someone like Nygaard is included, the number of links to other mathematicians in their Wikipedia biography gives a good indication of how central they are to mathematics. Nygaard’s page has many links to people we could classify as computer scientists, many to those involved in Norwegian politics but only one to another mathematician in our database. If a mathematician is not particularly important to mathematics then our network representation will place that person in a peripheral position in the network. Including, or indeed excluding, that person will have very little effect on our results. Our network method gives us some protection against uncertainties in the definition of a mathematician. People who are listed as mathematicians, yet whom most of us would regard as marginal to the development of mathematics, are likely to have biographies with few if any connections to the largest connected component in our network, so such marginal cases will be lowly ranked and will have little effect on the rating of others.

Many of the problems in our data are most severe for lower ranked mathematicians. Their biographies are likely to have been read and checked less often, and they have fewer links so each link becomes relatively more important for that mathematician. Again Kristen Nygaard provides a nice example. His web page is extensive with many links but very few are to pages of mathematicians and it is likely that most readers are not interested in any of his links to mathematics. So those links are liable to be noisier and less reliable, just as his inclusion in the list of mathematicians at all is debatable. The fat-tailed distribution of our measures shows that most mathematicians have low ratings. Hence a small change in rating can produce a large change in rank. This just emphasises that robustness checks are a vital part of any analysis, yet are so often missing from discussions based on expert opinion alone.
Our approach protects us against such uncertainty.

Another important conclusion is that our results support a key assumption we make, namely that the hyperlinks in Wikipedia biographies do contain useful information about the importance of individual mathematicians. Since our lists of top mathematicians look sensible in our own expert judgement, since they are relatively consistent over the three years of Wikipedia data we use, since they match well with a similar analysis of the MacTutor data, and since an informal survey is in rough agreement, it does suggest our method is reliable. This then supports our assumption that the hyperlinks reflect importance. In particular, we do not need to look at the context of each hyperlink to extract this information on importance. Of course, had we a reliable way to look at the context of each link, to reject those which were not useful (e.g. a link from mathematician A to mathematician B in text reading “mathematician A never knew of the work by mathematician B”), then we might make our analysis more accurate and so reduce the uncertainties we place on our results. The success of our assumption is not too surprising, as search engines use the hyperlink structure to successfully rank web pages in their recommendations to people searching the web. However, it is interesting and non-trivial to see that in our specific context, network analysis of a large number of web sites can produce useful information about the most important mathematicians. Of course, there is much more information in these biographies than we use here and exploiting additional information is likely to improve the accuracy of the analysis, especially for less important mathematicians.

Our focus on data derived from crowdsourced biographies on Wikipedia means we should consider wider issues often raised in such a context. Other studies have looked at the accuracy of the information in Wikipedia, such as Giles (2005); Warncke-Wang et al.
(2013); Chesney (2006); Wilkinson and Huberman (2007), and the results seem generally positive. Our use of a simple noise model makes some allowance for this issue. Issues over gender bias (for example see Wagner et al. (2015)) or cultural bias (e.g. see Eom et al. (2015)) may well be relevant here but we do not study them in any detail. Since many of our most important mathematicians are historical figures (this is one reason why modern citation analysis cannot be used for our study), there is a further complication in that it may be hard to untangle inherent bias in the editors of Wikipedia biographies of historical mathematicians from the bias present in the intermediary sources, such as those cited in the Wikipedia pages, or indeed biases inherent in the societies in which these mathematicians lived.

For instance it is very obvious when looking at our data that there are very few Asian mathematicians in our results. Is this a reflection of their true lack of influence on modern mathematics? One might hope that practical necessity or a greedy enthusiasm for greater knowledge, power and wealth can sometimes overcome cultural tensions. Certainly the influence of Arab mathematics and astronomy on Western science contains many positive examples of this.

One aspect of our work that makes us particularly vulnerable to cultural differences is our focus on individuals. If the original source of mathematical innovation was lost, deliberately or […]

Acknowledgement

TSE would like to thank the many colleagues with whom he has worked on earlier studies of the MacTutor History of Mathematics archive (O’Connor and Robertson, 2017) data. In particular TSE thanks C. Clarke (2011) and N. Hopkins (2011), whose work and reports provided the information on centrality measurements of MacTutor referred to in this paper.

References

Aragón, P., Kaltenbrunner, A., Laniado, D., Volkovich, Y., Apr. 2012. Biographical social networks on Wikipedia - a cross-cultural study of links that made history. In: WikiSym ’12 Proceedings of the Eighth Annual International Symposium on Wikis and Open Collaboration.

Bavelas, A., 1950. Communication patterns in task-oriented groups. The Journal of the Acoustical Society of America 22 (6), 725–730.

Bentley, R. A., Lipo, C. P., Herzog, H. A., Hahn, M. W., May 2007. Regular rates of popular culture change reflect random copying. Evolution and Human Behavior 28 (3), 151–158.
URL http://dx.doi.org/10.1016/j.evolhumbehav.2006.10.002

Berlinghoff, W. P., Gouvea, F. Q., 2004. Math through the ages: a gentle history for teachers and others, expanded edition Edition. Oxton House Publishers, Farmington, Me. and Mathematical Association of America, Washington, D.C.

Brandes, U., May 2008. On variants of shortest-path betweenness centrality and their generic computation. Social Networks 30 (2), 136–145.

Brandes, U., Hildenbrand, J., Nov 2014. Smallest graphs with distinct singleton centers. Network Science 2 (03), 416–418.
URL http://dx.doi.org/10.1017/nws.2014.25

Brin, S., Page, L., 1998. The anatomy of a large-scale hypertextual web search engine. Computer networks and ISDN systems 30 (1-7), 107–117.

Bruggemann, R., Münzer, B., Halfon, E., 1994. An algebraic/graphical tool to compare ecosystems with respect to their pollution - the German river “Elbe” as an example - I: hasse-diagrams. Chemosphere 28, 863–872.

Chen, B., Lin, Z., Evans, T. S., 2018. The Wikipedia network of mathematicians.
URL http://dx.doi.org/10.6084/m9.figshare.5410981

Chesney, T., 2006. An empirical examination of Wikipedia’s credibility. First Monday 11 (11).

Clarke, C., 2011. The network of mathematical innovation. Master’s thesis, Imperial College London.

de Nooy, W., Mrvar, A., Batagelj, V., 2005. Exploratory Social Network Analysis with Pajek. Structural Analysis in the Social Sciences (No. 27). Cambridge University Press.

Ekenstierna, G. H., Lam, V. S.-M., 2016. Extracting scientists from Wikipedia. In: Digital Humanities 2016. From Digitization to Knowledge 2016: Resources and Methods for Semantic Processing of Digital Works/Texts, Proceedings of the Workshop, July 11, 2016, Krakow, Poland. No. 126 in Linköping Electronic Conference Proceedings. Linköping University Electronic Press, pp. 13–20.

Eom, Y.-H., Aragón, P., Laniado, D., Kaltenbrunner, A., Vigna, S., Shepelyansky, D. L., 2015. Interactions of cultures and top people of Wikipedia from ranking of 24 language editions. PloS one 10 (3), e0114825.

Evans, T., Giometto, A., 2011. Turnover rate of popularity charts in neutral models. Tech. rep., Imperial College London.
URL http://arxiv.org/abs/1105.4044 http://arXiv.org/abs/1506.06580

Hagberg, A., Swart, P., S Chult, D., 2008. Exploring network structure, dynamics, and function using networkx. Tech. rep., Los Alamos National Laboratory (LANL).

Hopkins, N., 2011. The network of mathematical innovation. Master’s thesis, Imperial College London.

Jatowt, A., Kawai, D., Tanaka, K., 2016. Digital history meets Wikipedia: Analyzing historical persons in Wikipedia. In: Proceedings of the 16th ACM/IEEE-CS on Joint Conference on Digital Libraries. JCDL ’16. ACM, New York, NY, USA, pp. 17–26.
URL http://doi.acm.org/10.1145/2910896.2910911

Langville, A. N., Meyer, C. D., 2012. Who’s no.1?: The science of rating and ranking. Princeton University Press.

Loach, T., Evans, T., 2017. Ranking journals using altmetrics. figshare.com.
URL http://dx.doi.org/10.6084/m9.figshare.1461693

Newman, M., 2010. Networks: an introduction. Oxford University Press.

O’Connor, J. J., Robertson, E. F., 2017. Mactutor history of mathematics archive.

Rice, A., 2015. Ivor Grattan-Guinness (23 June 1941 – 12 December 2014). BSHM Bulletin: Journal of the British Society for the History of Mathematics 30, 94–101.

Schoch, D., 2016. Periodic table of network centrality.
URL http://schochastics.net/sna/periodic.html

Schoch, D., Valente, T. W., Brandes, U., 2017. Correlations among centrality indices and a class of uniquely ranked graphs. Social Networks 50, 46–54.

Stedall, J. A., 2012. The History of Mathematics: A Very Short Introduction. Oxford University Press.

Valente, T. W., Coronges, K., Lakon, C., Costenbader, E., 2008. How correlated are network centrality measures? Connections (Toronto, Ont.) 28 (1), 16.

Wagner, C., Garcia, D., Jadidi, M., Strohmaier, M., 2015. It’s a man’s Wikipedia? assessing gender inequality in an online encyclopedia. In: ICWSM. pp. 454–463.

Warncke-Wang, M., Cosley, D., Riedl, J., 2013. Tell me more: An actionable quality model for wikipedia. In: Proceedings of the 9th International Symposium on Open Collaboration. ACM, p. 8.

Wilkinson, D. M., Huberman, B. A., 2007. Cooperation and quality in Wikipedia. In: Proceedings of the 2007 International Symposium on Wikis. ACM, pp. 157–164.

Appendix

Additional information is provided in this appendix.

A.1 Variance in Degree in Noise Model

Figure A1 (horizontal axis: degree; vertical axis: σ; series: theoretical, experimental): Each cross indicates the standard deviation in degree of one node after simulations. The theoretical result that $\sigma \approx [\ldots]\,\sqrt{k_{\mathrm{orig}}}$ is compatible with this numerical result, as the linear fit between variance and degree shows (an adjusted R-squared value of […]).

A.2 Formal Definitions of Centrality Measures

The closeness $c_v$ for a vertex $v$ is defined to be (Bavelas, 1950; Hagberg et al., 2008; Newman, 2010)

$$c_v = \frac{|\mathcal{C}_v| - 1}{\sum_{u \in \mathcal{C}_v \setminus v} d(u, v)}, \tag{A1}$$

where $d(u, v)$ is the length of the shortest path between vertex $u$ and some distinct vertex $v$ which is in the same component, $\mathcal{C}_u$, as $u$. Note that we use a standard normalisation using $N$, the number of vertices, but this is irrelevant after our rescaling (4).

Our formal definition of betweenness $b_v$ of a vertex $v$ is (Freeman, 1977; Brandes, 2008; Hagberg et al., 2008; Newman, 2010)

$$b_v = \sum_{s,t \in \mathcal{C}_v} \frac{\sigma(s, t | v)}{\sigma(s, t)}. \tag{A2}$$

Here $\mathcal{C}_v$ is the set of vertices of the component containing vertex $v$, $\sigma(s, t)$ is the number of shortest paths available from vertex $s$ to $t$, and $\sigma(s, t | v)$ is the number of shortest paths from $s$ to $t$ which pass through vertex $v$. This takes account of cases where there are two or more shortest paths between a pair of nodes $s$ and $t$.

Eigenvector centrality is derived from the adjacency matrix $A$, which we define such that $A_{ij}$ is one (zero) if there is a link (no link) from vertex $i$ to vertex $j$. The eigenvector centrality for a vertex $i$ is simply the $i$-th entry of the eigenvector of $A$ associated with the largest eigenvalue (Newman, 2010; Hagberg et al., 2008). We perform our analysis on the largest component, which then guarantees a unique value for each node.

PageRank is defined in terms of a transfer matrix, $T$, where each entry $T_{ji}$ represents the probability of a random walker at vertex $i$ moving to vertex $j$ at the next time step. So we have that

$$T_{ji} = \frac{1}{s^{(\mathrm{out})}_i} A_{ji}, \quad \text{where} \quad s^{(\mathrm{out})}_i = \sum_j A_{ji}. \tag{A3}$$

An additional stochastic process also occurs. At each step, with probability $\alpha$, the random walker follows a link chosen at random as given by the transfer matrix $T$, but with probability $(1 - \alpha)$ the current walk is deemed to end or, equivalently, we follow a new user or a new walk by starting at a randomly chosen vertex.
The Markovian matrix $G$ which describes this process is given by

$$G_{ij} = \alpha T_{ij} + (1 - \alpha) \frac{1}{N}, \tag{A4}$$

where $N$ corresponds to the total number of vertices and $\alpha$ is the damping factor, chosen to be $\alpha = 0.[\ldots]$ in this work. The probability that a random walker is at vertex $i$ in the long-time limit is proportional to the PageRank for that vertex and this is given by the $i$-th entry of the eigenvector associated with the largest eigenvalue of $G$. This makes PageRank similar to eigenvector centrality but different to the other centrality measures considered, in that PageRank probes the whole network structure using walks of all types.

A.3 Additional Results
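Returning briefly to the PageRank definition of equations (A3) and (A4) in Appendix A.2, the random walk with restarts can be illustrated by a short power iteration. The following is only a minimal sketch and not the code used for the results in this paper: the function names, the adjacency-list input format and the toy network are hypothetical, the value α = 0.85 is the conventional default rather than a value taken from this work, and the uniform restart from vertices with no outgoing links is an assumption of the sketch. It takes a link from vertex i to vertex j to mean that a walker at i may step to j.

```python
def pagerank(out_links, alpha=0.85, tol=1e-12, max_iter=1000):
    """Stationary distribution of the walk G = alpha * T + (1 - alpha) / N.

    out_links[i] lists the vertices that vertex i links to
    (a hypothetical input format chosen for this sketch).
    """
    n = len(out_links)
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        # Teleportation term (1 - alpha) / N, identical for every vertex.
        new_rank = [(1.0 - alpha) / n] * n
        for i, targets in enumerate(out_links):
            if targets:
                # Follow one of the out-links of vertex i with equal probability.
                share = alpha * rank[i] / len(targets)
                for j in targets:
                    new_rank[j] += share
            else:
                # Assumption: a walker on a vertex with no out-links restarts uniformly.
                share = alpha * rank[i] / n
                for j in range(n):
                    new_rank[j] += share
        change = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if change < tol:
            break
    return rank


def rescale_to_100(scores):
    """Rescale scores so that the largest value equals 100."""
    top = max(scores)
    return [100.0 * s / top for s in scores]


# Toy directed network (made up): 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
toy_links = [[1, 2], [2], [0]]
toy_rank = pagerank(toy_links)
toy_scaled = rescale_to_100(toy_rank)
```

In this toy network vertex 2 receives links from both other vertices and so obtains the largest PageRank; the rescaling helper mirrors the convention of quoting each measure on a common scale with 100 for the largest value.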

A.3.1 MacTutor Results

| Rank | Degree | Closeness | Betweenness | PageRank | O(2nd) Clustering | Word Count |
|---|---|---|---|---|---|---|
| 1 | Newton | Newton | Euclid | Euclid | Hilbert | Euler |
| 2 | Hilbert | Hilbert | Newton | Newton | Newton | Galileo |
| 3 | Euclid | Riemann | Euler | Laplace | Euclid | Leibniz |
| 4 | Riemann | Euler | Riemann | Hilbert | Riemann | Newton |
| 5 | Euler | Euclid | Van der Waerden | Lagrange | Klein | Laplace |
| 6 | Klein | Cauchy | Weierstrass | Euler | Euler | Nash |
| 7 | Weierstrass | Gauss | Hilbert | Riemann | Weierstrass | Ptolemy |
| 8 | Poincare | Klein | Dieudonne | Gauss | Descartes | Tait |
| 9 | Gauss | Dirichlet | Cartan Henri | Klein | Leibniz | Kepler |
| 10 | Einstein | Laplace | Cauchy | Aristotle | Gauss | Aristotle |
| 11 | Cauchy | Lagrange | Hardy | Cauchy | Einstein | Lax Anneli |
| 12 | Lagrange | Poincare | Leibniz | Leibniz | Huygens | Copernicus |
| 13 | Laplace | Fourier | Dirichlet | Einstein | Lagrange | Euclid |
| 14 | Leibniz | Weierstrass | Weil | Jacobi | Aristotle | Polya |
| 15 | Hardy | Legendre | Fermat | Weierstrass | Poincare | Escher |

Table A1: Centrality results for the top fifteen mathematicians in the directed network based on the hyperlinks between biographies on the MacTutor (O’Connor and Robertson, 2017) database, data from 2011. Copy of Table 4 from the appendix of Clarke (2011).

A.3.2 Wikipedia 2013 Results

Figure A2: The top 36 mathematicians (2013 data) by a scaled average of the five ratings: Degree, Betweenness, Closeness, Eigenvector centrality and PageRank. Mathematician ‘A’ is placed higher than mathematician ‘B’ if each of the five ratings of ‘A’ is higher than the corresponding rating for ‘B’. The arrow of the line points from the higher to the lower ranked mathematician but only those essential for the logical relationships are shown (a Hasse diagram of the corresponding poset, equivalently the transitively reduced form of the corresponding directed acyclic graph). The shape of a node, size of label and the vertical location reflect the ‘height’ of each node in the corresponding poset (see text for definition). The colour reflects the average scaled rating of each mathematician.

| Name | Degree | Betweenness | Closeness | Eigenvector | PageRank | Average mark | Rank |
|---|---|---|---|---|---|---|---|
| David Hilbert | […].39 | 100 | 100 | 88.27 | 85.64 | 92.26 | 1 |
| Isaac Newton | 100 | 69.88 | 90.59 | 100 | 100 | 92.09 | 2 |
| John von Neumann | […].79 | 92.67 | 97.07 | 58.73 | 81.55 | 80.96 | 3 |
| Euclid | […].35 | 60.63 | 91.19 | 85.62 | 80.55 | 80.07 | 4 |
| Aristotle | […].39 | 25.17 | 83.43 | 84.76 | 62.28 | 64.41 | 5 |
| Felix Klein | […].23 | 33.82 | 89.[…] | […] | […].78 | 61.65 | 6 |
| Leonhard Euler | […] | […].36 | 88.08 | 55.[…] | […].45 | 57.96 | 7 |
| Gottfried Wilhelm Leibniz | […].26 | 24.55 | 88.27 | 70.74 | 50.29 | 57.02 | 8 |
| Carl Friedrich Gauss | […].14 | 35.62 | 88.86 | 39.64 | 60.16 | 56.28 | 9 |
| Ivor Grattan-Guinness | […].18 | 38.02 | 92.68 | 70.88 | 37.24 | 56 | 10 |
| Emmy Noether | […].42 | 38.29 | 92.73 | 40.92 | 52.82 | 55.04 | 11 |
| Bertrand Russell | […].22 | 25.77 | 90.22 | 66.09 | 44.69 | 54.[…] | 12 |
| Georg Cantor | […].86 | 30.49 | 92.51 | 65.06 | 39.24 | 54.03 | 13 |
| Charles Sanders Peirce | […].34 | 26.26 | 90.63 | 61.68 | 39.08 | 51.[…] | 14 |
| Hermann Weyl | […].34 | 34.[…] | […].85 | 44.82 | 40.[…] | […].68 | 15 |
| Ptolemy | […].78 | 9.82 | 75.[…] | […].47 | 49.56 | 47.75 | 16 |
| Norbert Wiener | […].97 | 29 | 92.12 | 32.47 | 40.58 | 46.23 | 17 |
| Michael Atiyah | […] | […].68 | 85.45 | 10.21 | 52.71 | 45.75 | 18 |
| Johannes Kepler | […].18 | 15.39 | 81.[…] | […].71 | 39.15 | 45.71 | 19 |
| Alan Turing | […].13 | 29.35 | 89.64 | 27.83 | 41.93 | 44.98 | 20 |
| Archimedes | […].54 | 9.22 | 77.76 | 53.15 | 39.76 | 44.88 | 21 |
| G. H. Hardy | […].29 | 28.45 | 88.73 | 20.51 | 40.96 | 42.79 | 22 |
| Alfred Tarski | […].13 | 21.[…] | […].42 | 32.26 | 38.13 | 42.41 | 23 |
| Augustus De Morgan | […].09 | 12.96 | 85.58 | 45.63 | 32.[…] | […].47 | 24 |
| Christiaan Huygens | […].29 | 10.63 | 82.11 | 43.32 | 35.23 | 41.32 | 25 |
| Galileo Galilei | […].97 | 10.73 | 79.68 | 43.75 | 35 | 41.23 | 26 |
| George Boole | […].09 | 11.72 | 84.77 | 47.[…] | […].34 | 40.82 | 27 |
| William Rowan Hamilton | […].57 | 21.95 | 87.96 | 33.27 | 29.17 | 40.18 | 28 |
| Pierre-Simon Laplace | […].61 | 13.39 | 84.[…] | […].12 | 34.67 | 40.02 | 29 |
| Srinivasa Ramanujan | […].77 | 24.26 | 86.36 | 15.43 | 40.06 | 39.78 | 30 |
| Nicolaus Copernicus | […].29 | 6.27 | 77.25 | 43.37 | 32.53 | 38.94 | 31 |
| Pierre de Fermat | […].05 | 12.17 | 86.16 | 42.04 | 23.67 | 38.02 | 32 |
| Josiah Willard Gibbs | […].05 | 13.41 | 89.22 | 33.77 | 25.85 | 37.66 | 33 |
| Lejeune Dirichlet | […].93 | 8.37 | 83.54 | 32.29 | 30.11 | 37.25 | 34 |
| Apollonius of Perga | […].93 | 6.81 | 78.37 | 40.02 | 28.78 | 37.18 | 35 |

Table A2: Centrality scores for top 35 mathematicians from 2013 data (without noise) given on a common scale with 100 for the largest value according to (4). Ordered in terms of their average score rating.

A.3.3 Wikipedia 2017 Results

Figure A3: A comparison of the rank of mathematicians by degree (horizontal axis: rank sorted by degree) and by closeness (vertical axis: rank sorted by closeness). The top 35 mathematicians by their average score in the 2017 Wikipedia data are shown under different centrality measures. The labelled mathematicians are Isaac Newton, David Hilbert, Euclid, John von Neumann, Felix Klein, Aristotle, Leonhard Euler, Carl Friedrich Gauss, Gottfried Wilhelm Leibniz, Ptolemy, Emmy Noether, Bertrand Russell, Galileo Galilei, Archimedes, Hermann Weyl, Johannes Kepler, Alexander Grothendieck, Georg Cantor, Nicolas Bourbaki, Michael Atiyah, Ivor Grattan-Guinness, Alfred Tarski, Christiaan Huygens, Gaetano Fichera, Norbert Wiener, G. H. Hardy, Alan Turing, Charles Sanders Peirce, Nicolaus Copernicus, Vladimir Arnold, Andrey Kolmogorov, Saunders Mac Lane, Apollonius of Perga, Karl Weierstrass and Ronald Fisher.

Figure A4: A comparison of the rank of mathematicians by degree (horizontal axis: rank sorted by degree) and by betweenness (vertical axis: rank sorted by betweenness). The top 35 mathematicians by their average score in the 2017 Wikipedia data are shown under different centrality measures; the labelled mathematicians are the same 35 as in Figure A3.

A.3.4 Wikipedia 2018 Results

| Quantity | 2013 | 2018 | % Increase | 2018 After Rewiring |
|---|---|---|---|---|
| Mathematicians/Vertices | […] | […] | +37.4% | |
| Hyperlinks | […] | […] | +49.9% | |
| Undirected Edges | […] | […] | +47.3% | […] ± […] |
| Average Degree | […].21 | 3.[…] | +7.2% | […] ± […] |
| Vertices in largest component | […] | […] | +42.6% | […] ± […] |
| Edges in largest component | […] | […] | +47.4% | […] ± […] |
| Average Degree in largest component | […].71 | 4.[…] | +2.7% | […] ± […] |
| Network Diameter | 13 | 15 | +15.3% | […] ± […] |
| Average Path Length | […].07 | 5.[…] | +1.4% | […] ± […] |
| Clustering Coefficient | […].13 | 0.[…] | -7.7% | […] ± […] |

Table A3: Network parameters for the 2013 and 2018 dataset, the percentage change between 2013 and 2017 data, and the mean values found for an ensemble of 1000 rewired 2018 data sets (with one standard deviation uncertainty quoted) as defined by our noise model of Section 2.3 with p = 0.[…].

Figure A5 (both panels: probability against degree): On the left, the degree distribution for the 2018 network of mathematicians. On the right the data is binned (using log binning with the ratio of consecutive bin edges set to be 1.5) and a best fit straight line to this data is shown (slope is −[…] ± […]).
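The log binning described in the caption of Figure A5 can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the code behind the figure: the function names and the toy degree sequence are hypothetical, the first bin edge is assumed to sit at degree 1, each bin is represented by the geometric mean of its edges, and the straight line is an ordinary least-squares fit in log-log space. Only the ratio of 1.5 between consecutive bin edges is taken from the caption.

```python
import math


def log_binned_density(degrees, ratio=1.5):
    """Histogram positive degrees in bins whose consecutive edges differ by `ratio`.

    Returns (lower_edge, upper_edge, density) for each non-empty bin, where
    density is the fraction of vertices in the bin divided by the bin width.
    """
    values = [k for k in degrees if k > 0]
    edges = [1.0]
    while edges[-1] <= max(values):
        edges.append(edges[-1] * ratio)
    counts = [0] * (len(edges) - 1)
    for k in values:
        for b in range(len(counts)):
            if edges[b] <= k < edges[b + 1]:
                counts[b] += 1
                break
    total = len(values)
    return [
        (edges[b], edges[b + 1], counts[b] / (total * (edges[b + 1] - edges[b])))
        for b in range(len(counts))
        if counts[b] > 0
    ]


def loglog_slope(points):
    """Least-squares slope of log(y) against log(x) for (x, y) pairs."""
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Made-up degree sequence, purely for illustration.
toy_degrees = [1] * 1000 + [2] * 250 + [4] * 60 + [8] * 15 + [16] * 4
bins = log_binned_density(toy_degrees)
# Use the geometric centre of each bin as its representative degree.
toy_slope = loglog_slope([(math.sqrt(lo * hi), d) for lo, hi, d in bins])
```

Dividing each count by the bin width matters here: because the bins widen geometrically, raw counts per bin would flatten the apparent slope of a fat-tailed distribution.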

Figure A6: Degree distribution for the ten mathematicians whose Wikipedia biographies have the largest degree in the 2018 data (crosses). The circles give the mean degree for the same mathematicians as measured over 1000 simulations using our noise model of Section 2.3, where the error bars are specified by one standard deviation.

Figure A7: Whisker box plot for degree rank of mathematicians from 1000 simulations of our noise model from Section 2.3 applied to the 2018 data. The mathematicians shown are Isaac Newton, David Hilbert, Euclid, John von Neumann, Felix Klein, Aristotle, Leonhard Euler, Carl Friedrich Gauss, Ptolemy and Bertrand Russell; the vertical axis is rank. The lower and upper edges of the blue box show the 25th percentile ($Q_1$) and the 75th percentile ($Q_3$) of the rank of each mathematician; the red line in the middle of the box is the median. Given the small variation here, these lines often coincide. The black lines at the ends of the whiskers connected to the box are defined to be at $Q_1 - 1.5(Q_3 - Q_1)$ and $Q_3 + 1.5(Q_3 - Q_1)$. The remaining black crosses indicate outliers beyond the whiskers.

Figure A8: Whisker box plot for the rank of mathematicians by closeness, for the ten mathematicians with largest closeness (David Hilbert, John von Neumann, Hermann Weyl, Emmy Noether, Norbert Wiener, Georg Cantor, Bertrand Russell, Isaac Newton, Ivor Grattan-Guinness and Euclid). The closeness centrality is calculated for the largest component of the 2018 data and the uncertainties are estimated using 1000 simulations using the noise model of Section 2.3 with p = 0.[…]. The criteria used to place the boxes and other features of the plot are as in Fig. 3.
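The box and whisker conventions described in the caption of Figure A7 can be made concrete with a small calculation on the ranks obtained for one mathematician across simulation runs. The sketch below is illustrative only: the function name and the sample of ranks are hypothetical, the quartiles use linear interpolation between data points (one of several common conventions), and the whisker factor of 1.5 is the conventional choice, treated here as an assumption.

```python
import statistics


def box_whisker_summary(ranks, factor=1.5):
    """Quartiles, whisker positions and outliers for a sample of ranks."""
    data = sorted(ranks)
    # 'inclusive' interpolates linearly between the sorted data points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    spread = q3 - q1
    lower_whisker = q1 - factor * spread
    upper_whisker = q3 + factor * spread
    outliers = [r for r in data if r < lower_whisker or r > upper_whisker]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
        "outliers": outliers,
    }


# Made-up ranks of a single mathematician over ten simulation runs.
toy_ranks = [1, 1, 1, 2, 2, 2, 2, 3, 3, 9]
toy_summary = box_whisker_summary(toy_ranks)
```

For these made-up ranks the box spans 1.25 to 2.75 with median 2, the upper whisker limit sits at 5, and the single run in which the rank dropped to 9 is reported as an outlier.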
Figure A9: Whisker box plot for rank by betweenness of the ten mathematicians with highest betweenness (John von Neumann, David Hilbert, Isaac Newton, Euclid, Carl Friedrich Gauss, Nicolas Bourbaki, Leonhard Euler, Emmy Noether, Felix Klein and Andrey Kolmogorov). This is for the largest component of the 2018 data based on 1000 simulations using the noise model of Section 2.3. The criteria used to place the boxes and other features of the plot are as in Fig. 3.

Figure A10: Whisker box plot for rank of mathematicians derived from their eigenvector centrality (Isaac Newton, David Hilbert, Euclid, Aristotle, Gottfried Wilhelm Leibniz, Bertrand Russell, Ivor Grattan-Guinness, John von Neumann, Georg Cantor and Leonhard Euler). This is for the largest component of the 2017 data based on 1000 simulations using the noise model of Section 2.3. The criteria used to place the boxes and other features of the plot are as in Fig. 3.

Figure A11: Whisker box plot for rank of mathematicians derived from their PageRank ratings (Isaac Newton, David Hilbert, John von Neumann, Euclid, Felix Klein, Leonhard Euler, Aristotle, Carl Friedrich Gauss, Emmy Noether and Ptolemy). This is for the largest component of the 2018 data based on 1000 simulations using the noise model of Section 2.3. The criteria used to place the boxes and other features of the plot are as in Fig. 3.

Figure A12: A comparison of the rank of mathematicians under different centrality measures (degree, betweenness, closeness, PageRank and eigenvector). The horizontal axis is the rank of each mathematician by their average score; the top 35 are shown. Note that as the rank gets higher, there is a small but increasing variation in the ranks by different centrality measures for each mathematician.

Figure A13 (horizontal axis: degree; series: experimental, theoretical): Each cross indicates the standard deviation in degree of one node after simulations for the top 35 mathematicians. The theoretical result that $\sigma \approx [\ldots]\,\sqrt{k_{\mathrm{orig}}}$ is compatible with this numerical result, as the linear fit between variance and degree shows (an adjusted R-squared value of […]).

| Name | Degree | Betweenness | Closeness | Eigenvector | PageRank | Average mark | Rank |
|---|---|---|---|---|---|---|---|
| Isaac Newton | 100 | 77.81 | 92.46 | 100 | 100 | 99.84 | 1 |
| David Hilbert | 92 | 91.76 | 100 | 87.29 | 94.15 | 91.99 | 2 |
| Euclid | […] | […].24 | 92.32 | 84.88 | 83.27 | 86.28 | 3 |
| John von Neumann | […] | […].19 | […] | 87.28 | 62.25 | 80.35 | 4 |
| Felix Klein | […] | […].65 | […] | 71.48 | 55.62 | 73.95 | 5 |
| Aristotle | […] | […].93 | 85.24 | 62.14 | 76.55 | 66.78 | 6 |
| Leonhard Euler | […] | […].89 | 91.35 | 68.52 | 60.06 | 65.78 | 7 |
| Carl Friedrich Gauss | […] | […].41 | 92.02 | 58.59 | 49.02 | 56.26 | 8 |
| Ptolemy | […].82 | 8.82 | 75.29 | 48.67 | 39.15 | 52.05 | 9 |
| Bertrand Russell | […].09 | 29.41 | 92.47 | 47.46 | 69.87 | 51.27 | 10 |
| Emmy Noether | […].36 | 43.82 | 93.62 | 50.[…] | […].41 | 50.58 | 11 |
| Gottfried Wilhelm Leibniz | […].36 | 24.[…] | […].43 | 48.36 | 70.02 | 50.49 | 12 |
| Galileo Galilei | […].91 | 16.45 | 82.78 | 46.02 | 53.27 | 49.02 | 13 |
| Archimedes | […].45 | 9.45 | 80.31 | 41.72 | 55.51 | 47.66 | 14 |
| Hermann Weyl | […].26 | 38.98 | 94.93 | 44.[…] | […].74 | 45.35 | 15 |
| Michael Atiyah | […].34 | 33.76 | 87.89 | 46.76 | 11.71 | 42.68 | 16 |
| Johannes Kepler | […].61 | 13.35 | 80.79 | 40.03 | 43.29 | 41.73 | 17 |
| G. H. Hardy | […].88 | 35.26 | 90.28 | 45.61 | 22.92 | 41.14 | 18 |
| Georg Cantor | […].88 | 26.24 | 92.48 | 37.35 | 61.61 | 41.01 | 19 |
| Alfred Tarski | […].15 | 21.44 | 86.52 | 40.63 | 32.19 | 40.[…] | 20 |
| Nicolas Bourbaki | […].15 | 43.93 | 91.82 | 42.87 | 22.58 | 40.25 | 21 |
| Alexander Grothendieck | […].15 | 25.73 | 86.29 | 42.31 | 10.82 | 40.25 | 22 |
| Alan Turing | […].69 | 23.23 | 87.94 | 40.73 | 29.29 | 38.98 | 23 |
| Ivor Grattan-Guinness | […].69 | 25.75 | 92.44 | 34.33 | 66.[…] | […].63 | 24 |
| Andrey Kolmogorov | […].96 | 40.01 | 90.[…] | […].82 | 23.[…] | […].15 | 25 |
| Charles Sanders Peirce | […].23 | 21.76 | 89.98 | 34.98 | 54.02 | 37.35 | 26 |
| Christiaan Huygens | […] | […].74 | 85.01 | 35.37 | 47.[…] | […].63 | 27 |
| Norbert Wiener | […] | […].58 | 92.69 | 39.[…] | […].02 | 36.[…] | 28 |
| Richard Courant | […].04 | 25.76 | 89.47 | 38.19 | 23.77 | 35.25 | 29 |
| Emil Artin | […].58 | 28.58 | 89.88 | 37.22 | 21.58 | 33.67 | 30 |
| Vladimir Arnold | […].12 | 39.89 | 89.31 | 41.16 | 19.61 | 32.29 | 31 |
| Bernhard Riemann | […].12 | 21.79 | 91.61 | 29.56 | 41.88 | 32.21 | 32 |
| Srinivasa Ramanujan | […].39 | 25.22 | 88.38 | 36.97 | 16.17 | 31.46 | 33 |
| Alfred North Whitehead | […].01 | 14.23 | 87.37 | 25.73 | 42.95 | 27.07 | 34 |
| Pierre de Fermat | […].28 | 13.53 | 87.89 | 23.31 | 46.77 | 26.47 | 35 |

Table A4: Centrality scores for top 35 mathematicians from 2018 data (without noise) given on a common scale with 100 for the largest value according to (4). Ordered in terms of their average score rating.

| Name | Degree | Betweenness | Closeness | Eigenvector | PageRank | Average | Rank |
|---|---|---|---|---|---|---|---|
| Isaac Newton | […] ± […].87 | 88.[…] ± […].12 | 95.[…] ± […].37 | 99.[…] ± […].39 | 95.[…] ± […].39 | 96.[…] ± […].96 | 1 |
| David Hilbert | […] ± […].68 | 93.[…] ± […].34 | 100 ± […].04 | 87.[…] ± […].95 | 94.[…] ± […].11 | 93.[…] ± […].93 | 2 |
| John von Neumann | […] ± […].33 | 94.[…] ± […].09 | 97.[…] ± […].12 | 84.[…] ± […] | […] ± […].57 | 84.[…] ± […] | 3 |
| Euclid | […] ± […].51 | 65.[…] ± […].88 | 94.[…] ± […].23 | 86.[…] ± […].84 | 79.[…] ± […].73 | 82.[…] ± […].22 | 4 |
| Felix Klein | […] ± […].04 | 51.[…] ± […].14 | 93.[…] ± […].25 | 72.[…] ± […].18 | 57.[…] ± […].16 | 69.[…] ± […] | 5 |
| Leonhard Euler | […] ± […].77 | 48 ± […].62 | 92.[…] ± […].33 | 68.[…] ± […].07 | 57.[…] ± […].81 | 66.[…] ± […].53 | 6 |
| Aristotle | […] ± […].68 | 37.[…] ± […].86 | 90.[…] ± […].47 | 64.[…] ± […].86 | 70.[…] ± […].24 | 65.[…] ± […].53 | 7 |
| Carl Friedrich Gauss | […] ± […].44 | 43.[…] ± […].98 | 92.[…] ± […].27 | 57.[…] ± […].68 | 47.[…] ± […].11 | 59.[…] ± […].19 | 8 |
| Bertrand Russell | […] ± […].19 | 33.[…] ± […].39 | 93.[…] ± […].24 | 48.[…] ± […].17 | 64.[…] ± […].17 | 58.[…] ± […].11 | 9 |
| Gottfried Wilhelm Leibniz | […] ± […].17 | 30.[…] ± […].76 | 91.[…] ± […].21 | 48.[…] ± […].27 | 64.[…] ± […].87 | 57.[…] ± […].98 | 10 |
| Emmy Noether | […] ± […] | […] ± […].51 | 94.[…] ± […].22 | 49.[…] ± […].25 | 44.[…] ± […].22 | 56.[…] ± […].26 | 11 |
| Hermann Weyl | […] ± […].88 | 37.[…] ± […].68 | 94.[…] ± […].13 | 44.[…] ± […].96 | 49.[…] ± […] | […] ± […].93 | 12 |
| Georg Cantor | […] ± […].85 | 26.[…] ± […].37 | 92.[…] ± […].13 | 38.[…] ± […].81 | 57.[…] ± […].61 | 51.[…] ± […].76 | 13 |
| Galileo Galilei | […] ± […].09 | 23.[…] ± […].21 | 87.[…] ± […].48 | 47.[…] ± […].19 | 48.[…] ± […].44 | 51.[…] ± […].78 | 14 |
| Ivor Grattan-Guinness | […] ± […].63 | 25.[…] ± […].61 | 93.[…] ± […].13 | 34.[…] ± […].52 | 61.[…] ± […].47 | 50.[…] ± […].71 | 15 |
| Archimedes | […] ± […].13 | 17.[…] ± […].89 | 86.[…] ± […].61 | 43.[…] ± […].06 | 50.[…] ± […].92 | 49.[…] ± […].87 | 16 |
| Ptolemy | […] ± […].15 | 18.[…] ± […].93 | 83.[…] ± […].91 | 50.[…] ± […].22 | 36.[…] ± […].01 | 48.[…] ± […].77 | 17 |
| Charles Sanders Peirce | […] ± […].53 | 23.[…] ± […].12 | 91.[…] ± […].24 | 35.[…] ± […].52 | 49.[…] ± […].34 | 47.[…] ± […].57 | 18 |
| Nicolas Bourbaki | […] ± […] | […] ± […].35 | 91.[…] ± […].24 | 41.[…] ± […].05 | 25.[…] ± […] | […] ± […].74 | 19 |
| G. H. Hardy | […] ± […].71 | 34.[…] ± […].84 | 90.[…] ± […].35 | 44.[…] ± […].98 | 24.[…] ± […].64 | 46.[…] ± […].51 | 20 |
| Andrey Kolmogorov | […] ± […].75 | 35.[…] ± […].91 | 90.[…] ± […].34 | 43.[…] ± […].15 | 24.[…] ± […].44 | 46.[…] ± […].52 | 21 |
| Norbert Wiener | […] ± […].55 | 30.[…] ± […] | […] ± […] | […] ± […].72 | 34.[…] ± […] | […] ± […].49 | 22 |
| Johannes Kepler | […] ± […].84 | 18.[…] ± […].81 | 86.[…] ± […].66 | 40.[…] ± […].92 | 39.[…] ± […].88 | 45.[…] ± […].64 | 23 |
| Michael Atiyah | […] ± […].78 | 32.[…] ± […].96 | 88.[…] ± […].46 | 45.[…] ± […].01 | 15.[…] ± […].14 | 45.[…] ± […].53 | 24 |
| Alfred Tarski | […] ± […].69 | 23.[…] ± […].78 | 87.[…] ± […].35 | 40.[…] ± […].82 | 31.[…] ± […].28 | 44.[…] ± […].46 | 25 |
| Alan Turing | […] ± […].73 | 23.[…] ± […].09 | 88.[…] ± […].35 | 40.[…] ± […].88 | 29.[…] ± […] | […] ± […].44 | 26 |
| Christiaan Huygens | […] ± […].55 | 15.[…] ± […].18 | 87.[…] ± […].39 | 35.[…] ± […].59 | 43.[…] ± […].29 | […] | 27 |
| Bernhard Riemann | […] ± […].38 | 20.[…] ± […].64 | 91.[…] ± […].13 | 29.[…] ± […].32 | 40.[…] ± […].34 | 43.[…] ± […].27 | 28 |
| Alexander Grothendieck | […] ± […].78 | 29.[…] ± […].98 | 88.[…] ± […].57 | 41.[…] ± […].92 | 15.[…] ± […].07 | 42.[…] ± […].53 | 29 |
| Richard Courant | […] ± […].47 | 25.[…] ± […].06 | 89.[…] ± […].25 | 37.[…] ± […].67 | 25.[…] ± […].48 | 42.[…] ± […].27 | 30 |
| Vladimir Arnold | […] ± […].32 | 31.[…] ± […].48 | 89.[…] ± […].49 | 38.[…] ± […].77 | 20.[…] ± […].33 | 42.[…] ± […].31 | 31 |
| Emil Artin | […] ± […].47 | 26.[…] ± […].12 | 89.[…] ± […].31 | 35.[…] ± […].67 | 23.[…] ± […].55 | 41.[…] ± […].35 | 32 |
| Srinivasa Ramanujan | […] ± […].39 | 23.[…] ± […].43 | 88.[…] ± […].34 | 35.[…] ± […].69 | 17.[…] ± […].09 | 39.[…] ± […].14 | 33 |
| Pierre de Fermat | […] ± […].09 | 13.[…] ± […].89 | 89.[…] ± […] | […] ± […].02 | 42.[…] ± […].28 | 39.[…] ± […].07 | 34 |
| Alfred North Whitehead | […] ± […].18 | 14.[…] ± […].95 | 88.[…] ± […].22 | 25.[…] ± […].19 | 38.[…] ± […].29 | 39.[…] ± […].07 | 35 |

Table A5: Centrality scores for top 35 mathematicians derived from the noise model of Section 2.3 applied to the 2018 data with p = 0.[…] for 1000 simulations. The mean value and one standard deviation are quoted for each centrality measure for each mathematician. The scores for each run are always rescaled so that the largest value is 100, which explains why the value quoted for any one centrality measure is always less than 100. The column marked average gives the average over the five named centrality measures with associated standard deviation. Mathematicians are ordered in terms of this average and the ranks given are in terms of this average over centrality values.

Top 35 Degree PageRank Eigenvector Betweenness Closeness Average
Degree 1.00 0.98 0.75 0.78 0.36 0.96
PageRank 0.85 0.69 0.90 0.69 0.96