AutoMSC: Automatic Assignment of Mathematics Subject Classification Labels
Moritz Schubotz, Philipp Scharpf, Olaf Teschke, Andreas Kuehnemund, Corinna Breitinger, Bela Gipp
Preprint of:
M. Schubotz et al. "AutoMSC: Automatic Assignment of Mathematics Subject Classification Labels". In: Proceedings of the 13th Conference on Intelligent Computer Mathematics. 2020.

FIZ Karlsruhe, Germany ({first.last}@fiz-karlsruhe.de)
University of Wuppertal, Germany ({last}@uni-wuppertal.de)
University of Konstanz, Germany ({first.last}@uni-konstanz.de)

May 26, 2020

Abstract
Authors of research papers in mathematics and other math-heavy disciplines commonly employ the Mathematics Subject Classification (MSC) scheme to search for relevant literature. The MSC is a hierarchical alphanumerical classification scheme that allows librarians to specify one or multiple codes for publications. Digital libraries in mathematics, as well as reviewing services such as zbMATH and Mathematical Reviews (MR), rely on these MSC labels in their workflows to organize the abstracting and reviewing process. In particular, the coarse-grained classification determines the subject editor who is responsible for the actual reviewing process.

In this paper, we investigate the feasibility of automatically assigning a coarse-grained primary classification using the MSC scheme by regarding the problem as a multi-class classification machine learning task. We find that our method achieves an F1-score of over 77%, which is remarkably close to the agreement of zbMATH and MR (F1-score of 81%). Moreover, we find that the method's confidence score allows for reducing the effort by 86% compared to the manual coarse-grained classification effort, while maintaining a precision of 81% for automatically classified articles.

Introduction

zbMATH (https://zbmath.org/) classified more than 135k articles in 2019 using the Mathematics Subject Classification (MSC) scheme [6]. With more than 6,600 MSC codes, this classification task requires significant in-depth knowledge of various sub-fields of mathematics to determine the fitting MSC codes for each article. In summary, the classification procedure of zbMATH and MR is two-fold. First, all articles are pre-classified into one of 63 primary subjects, spanning from general topics in mathematics (00), to integral equations (45), to mathematics education (97). In a second step, subject editors assign fine-grained MSC codes in their area of expertise, inter alia with the aim of matching potential reviewers.

The automated assignment of MSC labels was analyzed by Rehurek and Sojka [9] in 2008 on the DML-CZ [14] and NUMDAM [3] full-text corpora. They report a micro-averaged F1-score of 81% for their public corpus. In 2013, Barthel, Tönnies, and Balke performed automated subject classification for parts of the zbMATH corpus [2]. They criticized the micro-averaged F1-measure, especially when the average is applied only to the best performing classes; they report a micro-averaged F1-score of 67.1% for the zbMATH corpus. They suggested training classifiers for a precision of 95% and assigning MSC class labels in a semi-automated recommendation setup. Moreover, they suggested measuring the human baseline (inter-annotator agreement) for the classification task, and they found that combining mathematical expressions with textual features improves the F1-score for certain MSC classes substantially. In 2014, Schöneberg and Sperber [11] implemented a method that combined formulae and text using an adapted part-of-speech tagging approach. Their paper reported a precision of more than 75% but did not state the recall. The proposed method was implemented and is currently being used, especially to pre-classify general journals [7], with additional information such as references. For a majority of journals, coarse- and fine-grained codes can be found by statistically analyzing the MSC codes of referenced documents matched within the zbMATH corpus. The editors of zbMATH hypothesize that this reference-based method outperforms the algorithm developed by Schöneberg and Sperber; confirming or rejecting this hypothesis was one motivation for this project.

The positive effect of mathematical features is confirmed by Suzuki and Fujii [16], who measured the classification performance on an arXiv and a MathOverflow dataset. In contrast, Scharpf et al. [10] could not measure a significant improvement of classification accuracy for the arXiv dataset when incorporating mathematical identifiers. In their experiments, Scharpf et al. evaluated numerous machine learning methods, which extended [4, 15] in terms of accuracy and run-time performance, and found that complex, compute-intensive neural networks do not significantly improve the classification performance.

In this paper, we focus on the coarse-grained classification of the primary MSC subject number (pMSCn) and explore how current machine learning approaches can be employed to automate this process. In particular, we compare the current state-of-the-art technology [10] with a part-of-speech (POS) preprocessing based system customized for the application in zbMATH in 2014 [11].
Figure 1: Workflow overview (filter articles, match zbMATH and MR labels, split into training and test sets, train the model, and evaluate).

We define the following research questions:

1. Which evaluation metrics are most useful to assess the classifications?
2. Do mathematical formulae as part of the text improve the classifications?
3. Does POS preprocessing [11] improve the accuracy of classifications?
4. Which features are most important for accurate classification?
5. How well do automated methods perform in comparison to a human baseline?
To investigate these questions, we first created test and training datasets. We then investigated the different pMSCn encodings, trained our models, and evaluated the results, cf. Figure 1.
Filter current high-quality articles:
The zbMATH database has assigned MSC codes to more than 3.6M articles. However, the way in which mathematical articles are written has changed over the last century, and the classification of historic articles is not something we aim to investigate in this article. The first MSC was created in 1990 and has since been updated every ten years (2000, 2010, and 2020) [5]. With each update, automated rewrite rules are applied to map the codes from the old MSC version to the next, which entails a loss of accuracy of the class labels. To obtain a coherent, high-quality dataset for training and testing, we focused on the more recent articles from 2000 to 2019, which were classified using MSC version 2010, and we only considered selected journals (the list of selected journals is available from https://zbmath.org/?q=dt%3Aj+st%3Aj+py%3A2000-2019). Additionally, we restricted our selection to English articles and limited ourselves to abstracts rather than reviews of articles. To be able to compare methods based on references with methods using text and title, we only selected articles with at least one reference that could be matched to another article. In addition, we excluded articles that were not yet published and processed. The list of articles is available from our website: https://automsceval.formulasearchengine.com
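For illustration, this filter step can be sketched in Python with pandas. This is a minimal sketch, not the production query: the file name and the column names (year, language, has_abstract, matched_references, published) are hypothetical stand-ins for the actual zbMATH database fields.

import pandas as pd

# Hypothetical export of zbMATH metadata; the real selection ran on the database.
articles = pd.read_csv("zbmath_articles.csv")

selected = articles[
    articles["year"].between(2000, 2019)        # classified with MSC version 2010
    & (articles["language"] == "English")       # English articles only
    & articles["has_abstract"]                  # abstracts rather than reviews
    & (articles["matched_references"] >= 1)     # at least one matched reference
    & articles["published"]                     # already published and processed
]
selected.to_csv("filtered_articles.csv", index=False)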
Splitting into test and training sets:

After applying the filter criteria mentioned above, we split the resulting list of 442,382 articles into test and training sets. For the test set, we aimed to measure the bias of our zbMATH classification labels. Therefore, we used the articles for which we knew the classification labels of the MR service from a previous research project [1] as the test set. The resulting test set consisted of n = 32,230 articles, and the training set contained 410,152 articles. To ensure that this selection did not introduce additional bias, we also computed the standard ten-fold cross validation, cf. Section 3.
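A minimal sketch of this split, assuming a hypothetical boolean column has_mr_label that marks the articles matched to MR labels from [1]:

import pandas as pd

articles = pd.read_csv("filtered_articles.csv")  # the 442,382 filtered articles

# Articles with known MR labels form the test set (n = 32,230);
# the remaining 410,152 articles form the training set.
test = articles[articles["has_mr_label"]]
train = articles[~articles["has_mr_label"]]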
Definition of article data format:
To allow for reproducibility, we created a dedicated dataset from our article selection, which we aim to share with other researchers. Currently, however, legal restrictions apply, and the dataset cannot yet be provided for anonymous download; we can, however, grant access for research purposes, as done in the past [2]. Each of the 442,382 articles in the dataset contained the following fields:

de: An eight-digit ID of the document.
labels: The actual MSC codes.
title: The English title of the document, with LaTeX macros for mathematical language [12].
text: The text of the abstract, with LaTeX macros.
mscs: A comma-separated list of MSC codes generated from the references.

These five fields were provided as CSV files to the algorithms. Note that the fields de and labels must not be used as input to the classification algorithms. The mscs field was generated as follows: for each reference in the document, we looked up the MSC codes of the referenced article. For example, if a certain document contained the references A, B, C that are also documents in zbMATH, and the MSC codes of A, B, C are a1 and a2; b1; and c1, c2, and c3, respectively, then the field mscs will read "a1 a2, b1, c1 c2 c3".
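The following sketch illustrates the construction of the mscs field for the example above; the lookup table ref_codes is a hypothetical stand-in for the result of zbMATH's internal citation matching.

# MSC codes of the matched references (hypothetical lookup result)
ref_codes = {"A": ["a1", "a2"], "B": ["b1"], "C": ["c1", "c2", "c3"]}

def build_mscs(references):
    # Codes of one reference are space-separated; references are comma-separated.
    return ", ".join(" ".join(ref_codes[r]) for r in references if r in ref_codes)

print(build_mscs(["A", "B", "C"]))  # -> a1 a2, b1, c1 c2 c3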
After training, we required each of our tested algorithms to return the following fields in CSV format for the test sets:

de (integer): Eight-digit ID of the document.
method (char(5)): Five-letter ID of the run.
pos (integer): Position in the result list.
coarse (integer): Coarse-grained MSC subject number.
fine (char(5), optional): Fine-grained MSC code.
score (numeric, optional): Self-confidence of the algorithm about the result.

We ensured that the fields de, method, and pos form a primary key, i.e., no two entries in the result can have the same combination of values. Note that for the current multi-class classification problem, pos is always 1, since only the primary MSC subject number is considered.

While the assignment of all MSC codes to each article is a multi-label classification task, the assignment of the primary MSC subject, which we investigate in this paper, is only a multi-class classification problem. With $k = 63$ classes, the probability of randomly choosing the correct class of size $c_i$ is rather low: $P_i = c_i / n$. Moreover, the dataset is not balanced. In particular, the entropy $H = -\sum_{i=1}^{k} P_i \log P_i$ can be used to measure the imbalance after normalizing it to the maximum entropy $\log k$, i.e., $\hat{H} = H / \log k$. To take the imbalance of the dataset into account, we used weighted versions of precision $p$, recall $r$, and the F1-measure $f$. In particular, the weighted precision is $p = \frac{1}{n} \sum_{i=1}^{k} c_i p_i$ with the class precision $p_i$; $r$ and $f$ are defined analogously.

In the test set, no entries for the pMSCn 97 (Mathematics education) were included. Furthermore, we only considered classes with a minimum size of 200 test entries, which reduced the number of classes to $k = 37$ and slightly raised the normalized entropy $\hat{H}$. The chosen value of 200 can be interactively adjusted in the dynamic result figures we made available online at https://autoMSCeval.formulasearchengine.com. Additionally, the individual values of $P_i$ that were used to calculate $H$ are given in the column p of the table on that page. As one can see in the online version of the figures, the impact of the choice of the minimum class size is insignificant.
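As a sketch, the normalized entropy and the weighted precision defined above can be computed as follows; Scikit-learn's average="weighted" option implements the same size-weighted averaging for precision, recall, and F1.

import numpy as np

def normalized_entropy(class_sizes):
    # \hat{H} = H / log k with H = -sum_i P_i log P_i and P_i = c_i / n
    P = np.asarray(class_sizes, dtype=float)
    P /= P.sum()
    H = -np.sum(P * np.log(P))
    return H / np.log(len(P))

def weighted_precision(class_sizes, class_precisions):
    # p = (1/n) sum_i c_i p_i, i.e., class precisions weighted by class size
    c = np.asarray(class_sizes, dtype=float)
    return float(np.sum(c * np.asarray(class_precisions)) / c.sum())

# Equivalently, given predictions y_pred and manual labels y_true:
# from sklearn.metrics import precision_recall_fscore_support
# p, r, f, _ = precision_recall_fscore_support(y_true, y_pred, average="weighted")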
Selection of methods to evaluate

In this paper, we compare 12 different methods for (automatically) determining the primary MSC subject in the test dataset:

zb1: Reference MSC subject numbers from zbMATH.

mr1: Reference MSC subject numbers from MR.

titer: Following recent research performed on the arXiv dataset [10], we chose a machine learning method with a good trade-off between speed and performance. We combined the title, abstract text, and reference mscs of the articles via string concatenation. We encoded these string sources using the TfidfVectorizer of the Scikit-learn Python package [8]. We did not alter the UTF-8 encoding and did not perform accent stripping or other character normalization methods, with the exception of lower-casing. Furthermore, we used the word analyzer without a custom stop word list, selecting tokens of two or more alphanumeric characters, processing unigrams, and ignoring punctuation. The resulting vectors consisted of float64 entries with l2-norm unit output rows. This data was passed to our encoder, which was trained on the training set to subsequently transform (vectorize) the sources from the test set. We chose a lightweight LogisticRegression classifier from the Scikit-learn package. We employed the l2 penalty norm with a $10^{-4}$ tolerance stopping criterion and a regularization strength of 1.0. Furthermore, we allowed intercept constant addition and scaling, but no class weights or custom random state seed. We fitted the classifier using the lbfgs (limited-memory BFGS) solver for 100 convergence iterations. These choices were made based on a previous study in which we clustered arXiv articles. A sketch of this pipeline is shown below, after the method list.

refs: Same as titer, but using only the mscs as input. (Each of these single sources was encoded and classified separately.)

titls: Same as titer, but using only the title as input.

texts: Same as titer, but using only the text as input.

tite: Same as titer, but without using the mscs as input.

tiref: Same as titer, but without using the abstract text as input.

teref: Same as titer, but without using the title as input.

ref1: We used a simple SQL script to suggest the most frequent primary MSC subject based on the mscs input. This method is currently used in production to estimate the primary MSC subject.
uT1: We adjusted the Java program posLingue [11] (https://swmath.org/software/8058) to read from the new training and test sets. However, we did not perform a new training and instead reused the model that was trained in 2014. For this run, we removed all mathematical formulae from the title and the abstract text to generate a baseline.

uM1: The same as uT1, but in this instance we included the formulae. We slightly adjusted the formula detection mechanism, since the way in which formulae are written in zbMATH had changed [12]. This method is currently used in production for articles that do not have references with resolvable mscs.

After executing each of the methods described in the previous section, we calculated the precision p, recall r, and F1-score f for each method, cf. Table 1.

Table 1: Precision p, recall r, and F1-measure f with regard to the baseline zb1 (left) and mr1 (right).

Overall, we find that the results are similar whether we used zbMATH or MR as the baseline in our evaluation. Therefore, we use zbMATH as the reference for the remainder of the paper. All data, including the test results using MR as the baseline, is available from https://automsceval.formulasearchengine.com.
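The titer run can be sketched as the following Scikit-learn pipeline. The vectorizer and classifier settings correspond to the defaults described above; the column names (title, text, mscs, coarse) and the assumption that the two-digit primary MSC subject numbers are precomputed in a coarse column are ours, not the authors' exact code.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

pipeline = make_pipeline(
    # utf-8 input, lower-casing only, word unigrams of two or more
    # alphanumeric characters, float64 tf-idf vectors with l2-normalized rows
    TfidfVectorizer(lowercase=True, analyzer="word", ngram_range=(1, 1),
                    token_pattern=r"(?u)\b\w\w+\b", norm="l2"),
    # l2 penalty, 1e-4 tolerance, C=1.0, lbfgs solver, 100 iterations
    LogisticRegression(penalty="l2", tol=1e-4, C=1.0,
                       solver="lbfgs", max_iter=100),
)

# title, abstract text, and reference mscs concatenated per article
X_train = train["title"] + " " + train["text"] + " " + train["mscs"]
pipeline.fit(X_train, train["coarse"])  # primary MSC subject numbers
X_test = test["title"] + " " + test["text"] + " " + test["mscs"]
predicted = pipeline.predict(X_test)

The single-source and leave-one-out variants (refs, titls, texts, tite, tiref, teref) differ only in which of the three string sources enter the concatenation.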
Effect of mathematical expressions and part-of-speech tags:

By filtering out all mathematical expressions in the current production method uT1, in contrast to uM1, we obtained information on the impact of mathematical expressions on the classification quality. We found that the overall F1-score without mathematical expressions, f_uT1 = 64.5%, is slightly higher than the score with mathematical expressions, f_uM1 (also about 64%). Here, the main effect is an increase in recall from 63.9% to just over 64%. Additionally, a class-wise investigation showed that for most classes, uT1 outperformed uM1, cf. Figure 2. Exceptions are pMSCn 46 (Functional analysis) and 17 (Nonassociative rings and algebras), where the inclusion of math tags raised the F1-score slightly.

Figure 2: Mathematical expressions in title and abstract text do not improve the classification quality. Method uT1 = left bar; method uM1 = right bar.

We evaluated the effect of part-of-speech (POS) tagging by comparing tite with uM1: f_tite = 71.3% clearly outperformed f_uM1. This held true for all MSC subjects, cf. Figure 3. We also modified posLingue to output the POS-tagged text, used this text as input, and retrained the Scikit-learn classifier (tite2). However, this method did not lead to better results than tite.

Figure 3: Part-of-speech tagging for mathematics does not improve the classification quality. Method uM1 = left bar; method tite = right bar.
Effect of features and human baseline:

The newly developed combined method [10] works best in an approach that uses title, abstract text, and references (titer, f_titer = 77.2%). This method performs significantly better than methods that omit any one of these features. The best-performing single-feature method was refs (f_refs ≈ 74%), followed by texts (f_texts = 69.9%) and titls (f_titls ≈ 62%). When the approach excluded the title (teref, f ≈ 77%) or the abstract text (tiref, f ≈ 76%), the performance remained notably higher than when the approach excluded the reference mscs (tite, f_tite = 71.3%). Nevertheless, tite still outperformed the current production method ref1 (f_ref1 ≈ 65%), despite ignoring references. In conclusion, we can say that training a machine learning algorithm that weights all information from the fine-grained MSC codes of the references is clearly better than a majority vote of the references, cf. Figure 4.

Figure 4: The machine learning method (refs, left) clearly outperforms the current production method (ref1, right) when using references as the only source for classification.

Even the best-performing machine learning algorithm, titer with f_titer = 77.2%, does not reach the agreement of the human baseline mr1 with f_mr1 ≈ 81%. However, there is no ground truth that would allow us to determine which of the primary MSC subjects, from either MR or zbMATH, are truly correct. Assigning a two-digit label to mathematical research papers, which often cover overlapping themes and topics within mathematics, remains a challenge even to humans, who struggle to conclusively label publications as belonging to only a single class. While for some classes expert agreement is very high, e.g., for class 20 the agreement is 89.6% regarding the F1-score, for other classes it is considerably lower, cf. Figure 5. These discrepancies reflect the intrinsic problem that mathematics cannot be fully captured by a hierarchical classification system. The differences in classifications made by the two reviewing services likely also reflect an emphasis on different facets of evolving research, which often derives from differences in the reviewing culture.

Figure 5: For many pMSCn, the best automatic method (titer, right) gets close to the performance of the human baseline (mr1, left).

We also investigated the bias introduced by the non-random selection of the training set. Performing ten-fold cross validation on the entire dataset yielded an accuracy of f_titer,10 = 77.6% with a small standard deviation. Thus, the test set selection does not introduce a significant bias.

After having discussed the strengths and weaknesses of the individual methods tested, we now discuss how the currently best-performing method, titer, can be improved. One standard tool to analyze misclassifications is a confusion matrix, cf. Figure 6. In this matrix, off-diagonal elements indicate that two sets of classes are often mixed up by the classification algorithm; the x-axis shows the true labels, while the y-axis shows the predicted labels. The most frequent error of titer was that 68 (Computer science) was classified as 05 (Combinatorics). Moreover, 81 (Quantum theory) and 83 (Relativity and gravitational theory) were often mixed up. In general, however, the number of misclassifications was small, and there was no immediate action that one could take to avoid special cases of misclassification without involving a human expert.

Figure 6: Confusion matrix of titer.

Since titer outperforms both the text-based and the reference-based methods currently used at zbMATH, we decided to develop a RESTful API that wraps our trained model into a service. We use Python's FastAPI served by Uvicorn to handle higher loads. Our system is available as a Docker container and can thus be scaled on demand.
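A minimal sketch of such a wrapper is shown below; the endpoint name, model file, and response format are illustrative assumptions, not the exact production interface of the service.

from fastapi import FastAPI
from pydantic import BaseModel
import joblib

app = FastAPI()
model = joblib.load("titer_model.joblib")  # the trained pipeline from above

class Article(BaseModel):
    title: str
    text: str
    mscs: str

@app.post("/classify")
def classify(article: Article):
    # Concatenate the sources exactly as during training.
    source = f"{article.title} {article.text} {article.mscs}"
    scores = model.predict_proba([source])[0]
    ranked = sorted(zip(model.classes_, scores), key=lambda x: -x[1])
    # Return the most likely primary MSC subjects with confidence scores.
    return {"suggestions": [{"coarse": str(c), "score": float(s)}
                            for c, s in ranked[:5]]}

Started with, e.g., uvicorn automsc_service:app (the module name is hypothetical), the container can be replicated behind a load balancer to scale on demand.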
To simplify development and testing, we provide a static HTML page as a micro UI, which we call AutoMSC. This UI displays not only the most likely primary MSC subjects but also the less likely MSC subjects. We expect that our UI can support human experts, especially whenever the most likely MSC subject seems unsuitable. The result is displayed as a pie chart, cf. Figure 8, at https://automscbackend.formulasearchengine.com. To use the system in practice, an interface to the citation-matching component of zbMATH would be desirable, so that one can paste the actual references rather than the MSC subjects extracted from the references. Moreover, the precision-recall curve for titer (Figure 7) suggests that one can also select a threshold below which the system falls back to manual classification. For instance, if one requires a precision as high as that of the human classifications by MR, one would only need to consider suggestions with a score > 0.5. This would automatically classify 86.2% of the 135k articles being annually classified by subject experts at zbMATH/MR and would thus significantly reduce the number of articles that humans must manually examine, without a loss of classification quality. This is something we might develop in the future.

Figure 7: Precision-recall curve for titer.
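A sketch of this threshold selection: sort the test articles by the model's confidence score and take the lowest score at which the cumulative precision still meets the target (0.81 here corresponds to the human-baseline precision mentioned above; the arrays scores and correct are assumed to come from the evaluation run).

import numpy as np

def threshold_for_precision(scores, correct, target=0.81):
    # scores: confidence of the top suggestion per article (numpy array)
    # correct: whether that suggestion matched the manual label (boolean array)
    order = np.argsort(-scores)  # most confident first
    precision_at_k = np.cumsum(correct[order]) / np.arange(1, len(scores) + 1)
    ok = np.where(precision_at_k >= target)[0]
    if len(ok) == 0:
        return None, 0.0         # target precision unreachable
    k = ok[-1]                   # largest cutoff still meeting the target
    coverage = (k + 1) / len(scores)  # share classified automatically
    return scores[order][k], coverage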
Conclusion & Future Work
Returning to our research questions, we summarize our findings as follows.

First, we asked which metrics are best suited to assess classification quality. We demonstrated that the classification quality for the primary MSC subject can be evaluated with classical information retrieval metrics such as precision, recall, and F1-score. We share the observation of Barthel, Tönnies, and Balke [2] that the averages do not reflect the performance of outliers, cf. Figures 1-4. However, for our methods, the difference between the best- and worst-performing class was significantly smaller than reported by [2].

Second, we wanted to find out whether taking into account the mathematical formulae contained in publications could improve the accuracy of classifications. In accordance with [10], we did not find evidence that mathematical expressions improved pMSCn classification. However, we did not evaluate advanced encodings of mathematical formulae; this will be a subject of future work, cf. Figure 1.

Third, we evaluated the effect of POS preprocessing [11] and found that modern machine learning methods do not benefit from the POS-tagging-based model developed by [11], cf. Figure 2.

Fourth, we evaluated which features are most important for an accurate classification. We conclude that the references have the highest predictive power, followed by the abstract text and the title.

Finally, we evaluated the performance of automated methods in comparison to a human baseline. We found that our best-performing method achieves an F1-score of 77.2%. The manual classification is significantly better for most classes, cf. Figure 4. However, the self-reported confidence score can be used to reduce the manual classification effort by 86.2% without a loss in classification quality.

In the future, we plan to extend our automated methods to predict full MSC codes. Moreover, we would like to be able to assign pMSCn to document sections, since we realize that some research just does not fit into one of the classes. We also plan to extend the application domain to other mathematical research artifacts, such as blog posts, software, or dataset descriptions. As a next step, we plan to generate pMSCn from authors using the same method we applied to references. We speculate that authors will have a high impact on the classification, since authors often publish in the same field. For this purpose, we are leveraging our prior research on affiliation disambiguation, which could be used as a fallback method for junior authors who have not yet established a track record. Another extension is a better combination of the different features. Especially when performing research on full MSC code generation, we will need to use a different encoding for the MSC codes from references and authors. However, this new encoding requires more main memory for the training of the model and cannot be handled on a standard laptop. Thereafter, we will re-investigate the impact of mathematical formulae, since the inherently combined representation of text and formulae was not successful.

Our work represents a further step in the automation of Mathematics Subject Classification and can thus support reviewing services such as zbMATH and Mathematical Reviews. For accessible exploration, we have made the best-performing approaches available in our AutoMSC implementation and have shared our code on our website. We envision that other application domains requiring an accurate labeling of publications according to the Mathematics Subject Classification, for example research paper recommendation systems or reviewer recommendation systems, will also be able to benefit from this work. AutoMSC delivers results comparable to human experts in the first stage of MSC labeling, without requiring manual labor or trained experts. In the future, zbMATH will use our new method for all journals that previously employed the method by Schöneberg and Sperber [11] introduced in 2014.
Acknowledgments:
This work was supported by the German Research Foundation (DFG grant GI 1259-1). The authors would like to express their gratitude to Felix Hamborg and Terry Ruas for their advice on the most recent machine learning technology.
References

[1] A. Bannister et al. "Editorial: On the Road to MSC 2020". In: EMS Newsletter.
[2] S. Barthel, S. Tönnies, and W. Balke. "Large-Scale Experiments for Mathematical Document Classification". In: Proc. Digital Libraries: Social Media and Community Networks, ICADL 2013. Vol. 8279. Springer, 2013, pp. 83-92.
[3] T. Bouche and O. Labbe. "The New Numdam Platform". In: Proc. CICM. Ed. by H. Geuvers et al. Vol. 10383. Springer, 2017, pp. 70-82.
[4] I. Evans. "Semi-supervised topic models applied to mathematical document classification". PhD thesis. University of Bath, Somerset, UK, 2017.
[5] P. Ion and W. Sperber. "MSC 2010 in SKOS - the transition of the MSC to the semantic web". In: Eur. Math. Soc. Newsl. 84 (2012), pp. 55-57.
[6] A. Kühnemund. "The role of applications within the reviewing service zbMATH". In: PAMM. doi: 10.1002/pamm.201610459.
[7] H. Mihaljević-Brandt and O. Teschke. "Journal profiles and beyond: what makes a mathematics journal 'general'?" In: Eur. Math. Soc. Newsl. 91 (2014), pp. 55-56.
[8] F. Pedregosa et al. "Scikit-learn: machine learning in Python". In: J. Mach. Learn. Res. 12 (2011), pp. 2825-2830.
[9] R. Rehurek and P. Sojka. "Automated Classification and Categorization of Mathematical Knowledge". In: Proc. CICM. Ed. by S. Autexier et al. Vol. 5144. Springer, 2008, pp. 543-557.
[10] P. Scharpf et al. "Classification and Clustering of arXiv Documents, Sections, and Abstracts Comparing Encodings of Natural and Mathematical Language". In: Proc. ACM/IEEE JCDL. 2020.
[11] U. Schöneberg and W. Sperber. "POS Tagging and Its Applications for Mathematics - Text Analysis in Mathematics". In: Proc. CICM. 2014.
[12] M. Schubotz and O. Teschke. "Four decades of TeX at zbMATH". In: Eur. Math. Soc. Newsl. 112 (2019), pp. 50-52.
[14] P. Sojka and R. Rehurek. "Classification of Multilingual Mathematical Papers in DML-CZ". In: Proc. RASLAN 2007. Masaryk University, 2007, pp. 89-96.
[15] P. Sojka et al. "Quo Vadis, Math Information Retrieval". In: Proc. RASLAN 2019, Karlova Studanka, Czech Republic, December 6-8, 2019. Ed. by A. Horák, P. Rychlý, and A. Rambousek. Tribun EU, 2019, pp. 117-128.
[16] T. Suzuki and A. Fujii. "Mathematical Document Categorization with Structure of Mathematical Expressions". In: Proc. ACM/IEEE JCDL. IEEE Computer Society, 2017, pp. 119-128. doi: 10.1109/JCDL.2017.7991566.

Listing 1: Use the following BibTeX code to cite this article:

@inproceedings{Schubotz2020b,
  author    = {Moritz Schubotz and Philipp Scharpf and Olaf Teschke and
               Andreas K\"uhnemund and Corinna Breitinger and Bela Gipp},
  title     = {AutoMSC: Automatic Assignment of Mathematics Subject
               Classification Labels},
  booktitle = {Proceedings of the 13th Conference on Intelligent Computer
               Mathematics},
  date      = {2020},
}