Assessing the Readability of Policy Documents on the Digital Single Market of the European Union
Jukka Ruohonen
University of Turku, Finland
Email: juanruo@utu.fi
Abstract: Today, literature skills are necessary. Engineering and other technical professions are not an exception to this requirement. Traditionally, technical reading and writing have been framed with a limited scope, containing documentation, specifications, standards, and related text types. Nowadays, however, the scope also covers other text types, including legal, policy, and related documents. Given this motivation, this paper evaluates the readability of 201 legislations and related policy documents in the European Union (EU). The digital single market (DSM) provides the context and five classical readability indices the methods. The empirical results indicate that (i) generally a Ph.D. level education is required to comprehend the DSM laws and policy documents. Although (ii) the results vary across the five indices used, (iii) readability has slightly improved over time.
Index Terms: readability index, comprehension, literature skills, text mining, legal texts, law, digital single market, EU
I. INTRODUCTION
Comprehension of different texts is a requirement of today’s life. Regardless of their background, citizens need to understand many text types to participate in society and manage their lives. At the same time, surveys around the world have reported declines in literature skills, generally defined not only as the ability to read and write but also to understand, evaluate, and use written texts [1]. There are many reasons behind the declines. Technology is one factor. Although existing results are not definite [2], smartphones and social media are among the technological factors partially explaining the declines. When the declines are coupled with other trends, such as today’s avalanches of misinformation and disinformation, the societal consequences may be ruinous in the long run.

Technical professions are not exempt from literature skills. Even technically highly specialized jobs, from those in programming to those in mechanical engineering, require the comprehension of different texts. Programmers need to write documentation and understand the documentation of other programmers. Engineers need to read and write technical specifications. Thus, language and literature skills, critical thinking, and related “humanistic” skills have long been on the agenda in engineering education programs [3]. There are even online courses specifically for technical writing skills [4]. Although the successes of these programs and courses may be debatable, the skill requirements have not disappeared. In fact, these have likely increased. A good example would be law. Many engineers nowadays also need to comprehend legal documents, particularly when legal counseling is not attainable, as is often the case in start-ups and small companies.

To this and other ends, educators have long tried to expose students to different text types [5]. To some extent, the attempts have borne fruit.
For instance, anecdotal evidence hints that engineering and computer science students are receptive to the grammar and vocabulary of law [6]. Yet questions remain whether and how well they are able to understand and translate law into technical specifications and implementations. In practice, collaboration between lawyers and engineers is often required for such translations [7]. To ease the translations, different formalization methods and tools have long been proposed [8]. Eventually, such solutions may eliminate the need to have lawyers in situations requiring the engineering of requirements originating from law. But computer scientists, engineers, and other technical professionals are obviously not the only ones required to read, interpret, and understand law.

The comprehension of law by ordinary citizens remains the most pressing issue. And existing results are not encouraging. For instance, even many legal counseling websites have been observed to be beyond the comprehension of those with weak literacy skills [9]. The problem can be argued to contain two parts. The first is intentional obfuscation. And, indeed, it is not difficult to find arguments that companies and their legal representatives deliberately obfuscate documents in order to evade transparency requirements [10]. Nor is public sector administration immune to such arguments. The second part derives from the ways lawyers, academics, and other professionals, including legislators, communicate in writing. The topics covered by them are often complex and hence the writing tends to be complex. Sometimes, though, complexity only serves complexity. Lawyers, for instance, have long been accused of writing gobbledygook [11]. Other professionals, including civil servants, are often no better in this regard.

Therefore, simple and understandable writing has long been seen as a part of good administration. The same applies to law-making for which language improvements have long been recommended [12].
This point motivates a research question (RQ1): how readable are the DSM-specific laws and policy documents? The framing to the DSM can be justified with the earlier remarks on technical professions; these are the laws and policy documents engineers specifically should be able to understand. The second research question (RQ2) is about validity: are there statistical differences between the five readability indices used? The final question (RQ3) is longitudinal: has the readability of the laws and documents improved over time?

II. RELATED WORK
The paper shares a long tradition of related work. Readability first became a research topic as early as the 1920s. Ever since, different quantitative indices have been proposed to gauge the perceived readability of a text. The ideal has been simplicity. Plain language, plain prose, or plain English have been the terms used to describe this ideal. It has been endorsed by authors of both fact and fiction. Actually, the classical assessments and opinions of both author types have been highly similar, as becomes evident by comparing the 1946 works of Flesch [13] and Orwell [14], respectively. The former author also developed a quantitative index for his ideal.

From the early 1970s onward, Flesch’s classical readability index was met by multiple competing indices. Yet these competing indices never abandoned the ideal of simplicity. Simple is usually better when a topic is complex. Therefore, a classical application domain has been technical writing by engineers [15], [16]. But the indices have also been applied for many other purposes. Other application domains include the evaluation of financial reports [17], websites [18] and their privacy policies [19], tweets by politicians who speak at the level of 4th to 5th graders [20], fake online reviews [21], and consent forms for scientific research [22], to name some examples.

Some previous work also exists on using indices to evaluate the readability of laws and related legal documents. The results have not been surprising. The comprehension requirements have been observed to be beyond the educational attainment of most people, and particularly of those who would most need information about their rights [23], [24]. Even the instructions delivered to juries in common law legal systems have been measured to be beyond the literature skills of many adults [25]. The decisions reached by some courts have also become more difficult to read over time, which, as such, reflects the increasing complexity of many legal questions [26].
However, no previous research seems to exist in the EU context according to a reasonable literature search. Regarding the engineering context, it is also worth remarking on the work that has focused on the understandability of the general logic of law [27]. But it is difficult to understand a logic when a text describing the logic is difficult to understand. This point provides a justification for evaluating the overall readability of legislations in the EU.

III. DATA
The dataset used covers the primary legal and policy documents on the digital single market of the EU. The word primary is used to emphasize that not all documents are included. The reason is simple: there exists no single database that would cover everything about the DSM. But, fortunately, the EU has assembled a portal that provides summaries about key topical policy areas. The area reserved for the DSM was used to assemble the dataset by covering all documents except archival material. Based on this portal [28], five domains of the DSM are present: general rules, electronic communication networks, personal data and privacy, copyright and audiovisual material, and data economy and data protection. The data collection resembled so-called snowball sampling: for each domain, all hyperlinks were visited, and the documents mentioned on the web pages were collected as long as these were directives, regulations, or communications (COM), staff working documents (SWD), recommendations, decisions, or joint declarations (JOIN) of the European Commission or the Council. Hence, references to informal documents, treaties, corrigenda, court cases, decisions of non-legislative EU institutions, and related document types were excluded. Both current and deprecated legislations were included in the dataset but only insofar as these were explicitly linked on the web pages.

The individual legislations and policy documents are enumerated in Fig. 1. Of these, about 34% are decisions, 32% directives, 18% regulations, and the remaining communications, recommendations, and other document types. The high number of decisions is partially explained by the specifications for frequency bands used for electronic and mobile communication.
Based on a subjective classification, as many as 35% of the laws and policy documents are about telecommunications. Privacy and data protection (14%), copyright and intellectual property (14%), different contracts and justice in general (6%), Internet governance (4%), and cyber security (3%) follow.

IV. METHODS
Five classical readability indices are used, as implemented in a Python package [29]. All measure the hypothetical grade level required to comprehend a text. The grades are based on those used in the United States. In theory, there is an upper limit in the grades (a graduate program in a university would be somewhere around the 22nd grade or so), but none of the indices impose limits. Negative values are also possible. Thus, in general, the higher the score, the more difficult a text is to comprehend. To ensure comparability, all scores from the indices are further rounded up to the nearest integer.

The first is the Flesch–Kincaid index [30]. It is defined as:

$$ g_1 = \left\lceil 0.39 \left( \frac{\text{words}}{\text{sentences}} \right) + 11.8 \left( \frac{\text{syllables}}{\text{words}} \right) - 15.59 \right\rceil , $$

where $g_1$ refers to a grade. The second is the SMOG index or the “simple measure of gobbledygook” [31]. It is given by:

$$ g_2 = \left\lceil 1.043 \sqrt{ \left( \frac{30 \times \text{polysyllables}}{\text{sentences}} \right) } + 3.1291 \right\rceil . $$

The third is ARI, the automated readability index [29], [32]:

$$ g_3 = \left\lceil 4.71 \left( \frac{\text{characters}}{\text{words}} \right) + 0.5 \left( \frac{\text{words}}{\text{sentences}} \right) - 21.43 \right\rceil . $$

The fourth is the Coleman-Liau index [33], as defined by:

$$ g_4 = \left\lceil 0.0588 \alpha - 0.296 \beta - 15.8 \right\rceil , $$

where $\alpha$ is the average number of letters per 100 words and $\beta$ the average number of sentences per 100 words. The fifth and final index is the Linsear Write readability formula. It is defined differently from the other four indices.
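To make the arithmetic of the four closed-form indices concrete, the following is a minimal Python sketch that works directly on pre-computed counts. It is not the implementation of the package cited as [29]; the function names and the example counts are hypothetical, and the constants are assumed to be the commonly published ones for each formula. Counting words, sentences, and syllables from raw text is deliberately left out, since that step is where implementations tend to differ.

```python
import math


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> int:
    """Flesch-Kincaid grade, rounded up to the nearest integer."""
    score = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    return math.ceil(score)


def smog_grade(polysyllables: int, sentences: int) -> int:
    """SMOG grade; polysyllables are words with three or more syllables."""
    score = 1.043 * math.sqrt(polysyllables * 30 / sentences) + 3.1291
    return math.ceil(score)


def ari_grade(characters: int, words: int, sentences: int) -> int:
    """Automated readability index (ARI), rounded up."""
    score = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43
    return math.ceil(score)


def coleman_liau_grade(letters: int, words: int, sentences: int) -> int:
    """Coleman-Liau grade from letters and sentences per 100 words."""
    alpha = 100 * letters / words    # letters per 100 words
    beta = 100 * sentences / words   # sentences per 100 words
    score = 0.0588 * alpha - 0.296 * beta - 15.8
    return math.ceil(score)


# Hypothetical counts for a 100-word passage with five sentences.
print(flesch_kincaid_grade(words=100, sentences=5, syllables=150))  # 10
print(smog_grade(polysyllables=10, sentences=5))                    # 12
print(ari_grade(characters=500, words=100, sentences=5))            # 13
print(coleman_liau_grade(letters=500, words=100, sentences=5))      # 13
```

Because every score is rounded up, two texts whose raw scores differ by a fraction can land on different grades, which is worth keeping in mind when comparing indices.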
In essence, the Linsear Write formula works as follows: for each 100-word sample, easy words (with two syllables or less) and hard words (three or more syllables) are counted and scored (with one and three points, respectively), after which the per-sample scores are divided by the number of sentences in the sample, and further scaled (for more details see [29]). As with the other indices, a rounded-up grade, $g_5$, is output from the arithmetic.

Decision 2004/915/EC; Decision 2002/622/EC; Decision 2001/497/EC; Decision 1247/2002/EC; Decision 11 June 2019; Decision (EU) 2020/636; Decision (EU) 2020/590; Decision (EU) 2020/1426; Decision (EU) 2019/785; Decision (EU) 2019/784; Decision (EU) 2019/236; Decision (EU) 2019/235; Decision (EU) 2019/165; Decision (EU) 2019/154; Decision (EU) 2018/661; Decision (EU) 2018/637; Decision (EU) 2018/254; Decision (EU) 2018/1996; Decision (EU) 2018/1962; Decision (EU) 2018/1961; Decision (EU) 2018/1927; Decision (EU) 2018/1538; Decision (EU) 2017/899; Decision (EU) 2017/2077; Decision (EU) 2017/191; Decision (EU) 2017/1483; Decision (EU) 2016/687; Decision (EU) 2016/339; Decision (EU) 2016/2317; Decision (EU) 2015/750; Decision (EU) 2015/1293; Decision (EU, Euratom) 2017/46; Decision (EU, Euratom) 2015/443; COM/2017/0725 final; COM(2016) 179 final; COM(2016) 117 final; COM(2015) 680 final; COM(2015) 192 final; COM(2014) 72 final; COM(2014) 442 final; COM(2014) 228 final; COM(2013) 634 final; COM(2013) 627 final; COM(2012) 596 final; COM(2011) 878 final; COM(2009) 277 final; COM(2009) 111 final; COM(2008) 593 final; COM(2003) 198 final; COM(1998) 476 final

Directive (EU) 2019/789; Directive (EU) 2019/771; Directive (EU) 2019/770; Directive (EU) 2019/2161; Directive (EU) 2019/1024; Directive (EU) 2018/1972; Directive (EU) 2018/1808; Directive (EU) 2017/1564; Directive (EU) 2016/681; Directive (EU) 2016/680; Directive (EU) 2016/2102; Directive (EU) 2016/1148; Directive (EU) 2015/1535; Decision 91/287/EEC; Decision 676/2002/EC; Decision 672/2002/EC; Decision 626/2008/EC; Decision 243/2012/EU; Decision 2014/641/EU; Decision 2014/276/EU; Decision 2014/243/EU; Decision 2014/221/EU; Decision 2013/752/EU; Decision 2013/743/EU; Decision 2013/654/EU; Decision 2013/504/EU; Decision 2013/275/EU; Decision 2013/195/EU; Decision 2012/688/EU; Decision 2012/472/EU; Decision 2012/471/EU; Decision 2011/829/EU; Decision 2011/667/EU; Decision 2011/485/EU; Decision 2011/251/EU; Decision 2010/87/EU; Decision 2010/368/EU; Decision 2010/267/EU; Decision 2010/166/EU; Decision 2009/766/EC; Decision 2009/449/EC; Decision 2009/381/EC; Decision 2008/477/EC; Decision 2008/411/EC; Decision 2007/98/EC; Decision 2007/90/EC; Decision 2007/344/EC; Decision 2006/771/EC; Decision 2005/513/EC; Decision 2005/222/JHA

Directive 96/9/EC; Directive 95/46/EC; Directive 93/98/EEC; Directive 93/83/EEC; Directive 93/13/EEC; Directive 91/250/EEC; Directive 88/301/EEC; Directive 87/372/EEC; Directive 2016/943; Directive 2014/61/EU; Directive 2014/53/EU; Directive 2014/35/EU; Directive 2014/30/EU; Directive 2014/26/EU; Directive 2014/25/EU; Directive 2014/24/EU; Directive 2014/23/EU; Directive 2013/98/EC; Directive 2013/40/EU; Directive 2013/37/EU; Directive 2012/28/EU; Directive 2011/83/EU; Directive 2011/77/EU; Directive 2010/13/EU; Directive 2009/81/EC; Directive 2009/43/EC; Directive 2009/24/EC; Directive 2009/140/EC; Directive 2009/136/EC; Directive 2009/114/EC; Directive 2008/63/EC; Directive 2008/52/EC; Directive 2008/294/EC; Directive 2006/95/EC; Directive 2006/24/EC; Directive 2006/116/EC; Directive 2006/115/EC; Directive 2004/48/EC; Directive 2002/77/EC; Directive 2002/58/EC; Directive 2002/22/EC; Directive 2002/21/EC; Directive 2002/20/EC; Directive 2002/19/EC; Directive 2001/84/EC; Directive 2001/29/EC; Directive 2000/31/EC; Directive 1999/93/EC; Directive 1999/5/EC; Directive (EU) 2019/790

SWD(2016) 308 final; SWD(2015) 100 final; Regulation (EU) 910/2014; Regulation (EU) 611/2013; Regulation (EU) 593/2008; Regulation (EU) 531/2012; Regulation (EU) 386/2012; Regulation (EU) 2020/857; Regulation (EU) 2019/881; Regulation (EU) 2019/517; Regulation (EU) 2019/1150; Regulation (EU) 2018/1971; Regulation (EU) 2018/1807; Regulation (EU) 2018/1727; Regulation (EU) 2018/1725; Regulation (EU) 2018/1488; Regulation (EU) 2018/1241; Regulation (EU) 2017/920; Regulation (EU) 2017/1939; Regulation (EU) 2017/1563; Regulation (EU) 2017/1128; Regulation (EU) 2017/1001; Regulation (EU) 2016/794; Regulation (EU) 2016/679; Regulation (EU) 2015/758; Regulation (EU) 2015/2120; Regulation (EU) 2015/1986; Regulation (EU) 1291/2013; Regulation (EU) 1215/2012; Regulation (EEC) 3577/92; Regulation (EC) 874/2004; Regulation (EC) 733/2002; Regulation (EC) 717/2007; Regulation (EC) 544/2009; Regulation (EC) 45/2001; Regulation (EC) 2006/2004; Regulation (EC) 1370/2007; Regulation (EC) 1211/2009; Regulation (EC) 1008/2008; Recommendation 2014/478/EU; Recommendation 2013/466/EU; Recommendation 2013/105/EC; Recommendation 2010/572/EU; Recommendation 2009/396/EC; Recommendation 2008/850/EC; Recommendation 2008/295/EC; Recommendation 2007/879/EC; Recommendation (EU) 2020/1307; JOIN(2013) 1 final; Directive 98/84/EC; Directive 98/34/EC

Fig. 1. Legal and Policy Documents Included in the Dataset ($n = 201$)

[Figure: five panels of grades (vertical axis: Grade). Flesch–Kincaid, median = 28; SMOG, median = 25; ARI, median = 36; Coleman–Liau, median = 16; Linsear, median = 72.]

Fig. 2. Readability Grades ($g_1, \ldots, g_5$, $n = 201$)

Two additional points are worth briefly making about these classical indices. First, there are many modifications to these, as well as numerous alternatives. Hundreds of individual variables were considered already in the 1970s for constructing readability indices [34]. More recent modifications have focused on natural language processing and machine learning methods [35], [36]. The second point follows: all readability indices have always been heavily criticized. The criticism and the resulting limitations are briefly discussed later in the concluding Section VI. For the present purposes, it suffices to justify the use of the five readability indices with the argument that modifications and more elegant methods seem uncalled for in the present application domain. As there is no prior empirical work in the domain, the five indices serve as a baseline.

V. RESULTS
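The results in this section rest on three summary computations: Pearson correlations between the grades of the indices, a sum variable built from row-wise arithmetic means, and Cronbach's α as a measure of internal consistency [37]. The following is a minimal sketch of these computations using only the Python standard library. The function names and the example grades are hypothetical and are not the paper's data; the paper does not state which software was used for these statistics.

```python
import math
from statistics import mean, variance


def row_means(columns):
    """Row-wise arithmetic mean across indices (one list of grades per index)."""
    return [mean(row) for row in zip(*columns)]


def pearson(x, y):
    """Pearson product-moment correlation between two equally long lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cronbach_alpha(columns):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(columns)
    item_variance = sum(variance(col) for col in columns)
    totals = [sum(row) for row in zip(*columns)]
    return (k / (k - 1)) * (1 - item_variance / variance(totals))


# Hypothetical grades for four documents from three indices.
fk = [20, 25, 30, 35]
smog = [18, 22, 27, 31]
ari = [26, 32, 38, 45]

sum_variable = row_means([fk, smog, ari])
print(sum_variable)
print(pearson(fk, ari))
print(cronbach_alpha([fk, smog, ari]))
```

When the indices move together across documents, the variance of the row totals dominates the sum of the individual variances and α approaches one, which is the sense in which a sum variable is said to be internally consistent.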
The grades across the documents are shown in Fig. 2 for each of the five readability indices. There are three points to make from the figure. First, the Linsear Write index fails a commonsense validity test. Although the grades from the index show a fairly large variance, the median of 72 is not a realistic value in practice. Second, the small standard deviation of […] of the grades from the Coleman-Liau index seems surprising. Third, the previous two points translate into modest correlations with the grades from the remaining three indices. In contrast, the grades from the Flesch-Kincaid, SMOG, and ARI indices are highly correlated (see Table I). By implication, a sum variable from these three indices (based on the row-wise arithmetic means) attains high internal consistency. For instance, Cronbach’s α-coefficient [37] is as large as […].

TABLE I
CORRELATIONS (PEARSON)

                      1.        2.        3.        4.        5.
1. Flesch-Kincaid
2. SMOG               0.[…]
3. ARI                0.995     0.[…]
4. Coleman-Liau      −0.111     0.[…]    −0.[…]
5. Linsear            0.330     0.347     0.[…]    −0.[…]

These observations are enough to answer the first and second research questions. Regarding RQ2: there are notable differences between the five indices. Regarding RQ1: based on the median of the sum variable, the overall readability is somewhere near the […]th grade. What does such a grade mean; how large is this value? Given that even the first quartile is about […], it seems reasonable to conclude that completion of a graduate program is required to comprehend the DSM laws and policy documents, at least insofar as the quantitative readability indices fulfill their intended function. Even if their validity is questioned, as can be done on many grounds, there is still a comparative viewpoint supporting the conclusion of overall complexity. For instance, local newspapers in the United States have been observed to attain values around […] at maximum [38]. Another comparative example would be abstracts in psychology papers for which values around […] have been reported [39].

[Figure: two panels over time (vertical axes: Frequency and Grade).]

Fig. 3. Readability Grades Across Time (sum variable)

Finally, the visualization in Fig. 3 suffices to answer RQ3. Although the number of laws and policy documents has steadily grown over the years, their readability has slightly improved when compared to the 1980s and 1990s. A potential explanation may relate to incremental law-making; many of the new laws enacted amend or replace old laws, building upon their foundations. Nevertheless, the averages have still remained around […] or so.

VI. CONCLUSION
This short paper evaluated the readability of laws and policy documents related to the digital single market of the EU. In general, these are difficult to comprehend according to the quantitative readability indices. Even though readability has slightly improved, the hypothetical grade level is still around thirty. What could explain this result? A partial explanation would be that the DSM laws and policies, in particular, simply are complex; many of them address highly technical topics. Another partial explanation stems from the EU’s law-making processes. By the time a law is finally enacted, it has gone through a heavy process of revisions, often including copy-pasted snippets from politicians pressed by lobbyists [40]. While copy-pasting is not unique to the EU [41], it is likely to increase gobbledygook. Another source relates to Regulation 1/1958 according to which multiple European languages must be accounted for already during the law-making processes.

A further partial explanation originates from the well-known limitations of the quantitative readability indices. In particular, the construct validity of these is often questionable; it is not entirely clear whether they measure what they intend to measure. It is not necessary to elaborate the lengthy criticism in detail. It suffices to note that the indices ignore grammar [15], and do not address the fact that simplicity by itself does not necessarily guarantee comprehension [42]. Applications of the newer machine learning methods to the domain of law would be a good topic for further research. A slightly better topic would be to examine the comprehension of laws and policy documents by human subjects. A particularly interesting question is the comprehension (or, possibly, lack thereof) among engineering and computer science students. The question is relevant because it has been computer science that has been seen to enhance law and its practice, often with questionable consequences [43], and not the other way around.

REFERENCES

[1] OECD, PISA 2018 Results: What Students Know and Can Do, Volume I. Paris: OECD Publishing, 2019.
[2] L. Verheijen, W. Spooren, and A. van Kemenade, “Relationships Between Dutch Youths’ Social Media Use and School Writing,” Computers and Composition, vol. 56, p. 102574, 2020.
[3] M. Abdulwahed, W. Balid, M. O. Hasna, and S. Pokharel, “Skills of Engineers in Knowledge Based Economies: A Comprehensive Literature Review, and Model Development,” in Proceedings of IEEE International Conference on Teaching, Assessment and Learning for Engineering (TALE 2013). Bali: IEEE, 2013, pp. 759–765.
[4] Google, Inc., “Technical Writing Courses,” 2021, Available online in January 2021: https://developers.google.com/tech-writing.
[5] M. W. Conley and A. Wise, “Comprehension for What? Preparing Students for Their Meaningful Future,” Theory Into Practice, vol. 50, pp. 93–99, 2011.
[6] M. Hildebrandt, Law for Computer Scientists and Other Folk. Oxford: Oxford University Press, 2020.
[7] K. Hjerppe, J. Ruohonen, and V. Leppänen, “The General Data Protection Regulation: Requirements, Architectures, and Constraints,” in Proceedings of the 27th IEEE International Requirements Engineering Conference (RE 2019). Jeju Island: IEEE, 2019, pp. 265–275.
[8] L. Mommers, W. Voermans, W. Koelewijn, and H. Kielman, “Understanding the Law: Improving Legal Knowledge Dissemination by Translating the Contents of Formal Sources of Law,” Artificial Intelligence and Law, vol. 17, pp. 51–78, 2009.
[9] D. D. Dyson and K. Schellenberg, “Access to Justice: The Readability of Legal Services Corporation Legal Aid Internet Services,” Journal of Poverty, vol. 21, no. 2, pp. 142–165, 2016.
[10] B. A. Rutherford, “Obfuscation, Textual Complexity and the Role of Regulated Narrative Accounting Disclosure in Corporate Governance,” Journal of Management and Governance, vol. 7, pp. 187–210, 2003.
[11] H. Frooman, “Lawyers and Readability,” The Journal of Business Communication, vol. 18, no. 4, pp. 45–51, 1973.
[12] U. Karpen, “Instructions for Law Drafting,” European Journal of Law Reform, vol. 10, no. 2, pp. 163–182, 2008.
[13] R. Flesch, How to Write, Speak, and Think More Effectively. New York: Harper & Brothers, 1960 [1946].
[14] G. Orwell, Politics and the English Language and Other Essays. Public domain 2018 edition, 1946.
[15] G. M. McClure, “Readability Formulas: Useful or Useless?” IEEE Transactions on Professional Communication, vol. PC-30, no. 1, pp. 12–15, 1987.
[16] S. Zhou, H. Jeong, and P. A. Green, “How Consistent Are the Best-Known Readability Equations in Estimating the Readability of Design Standards?” IEEE Transactions on Professional Communication, vol. 60, no. 1, pp. 97–111, 2017.
[17] Y. Sun, X. Wang, and Y. Yu, “Readability to Financial Report: A Comparative Study of Chinese and Foreign Countries,” in Proceedings of the Seventh International Joint Conference on Computational Sciences and Optimization (CSO 2014). Beijing: IEEE, 2014, pp. 69–73.
[18] E. K. Leong, M. T. Ewing, and L. F. Pitt, “E-Comprehension: Evaluating B2B Websites Using Readability Formulae,” Industrial Marketing Management, vol. 31, pp. 125–131, 2020.
[19] B. Fabian, T. Ermakova, and T. Lentz, “Large-Scale Readability Analysis of Privacy Policies,” in Proceedings of the International Conference on Web Intelligence (WI 2017). Leipzig: ACM, 2017, pp. 18–25.
[20] O. Kayam, “The Readability and Simplicity of Donald Trump’s Language,” Political Studies Review, vol. 16, no. 1, pp. 73–78, 2017.
[21] X. Wang, X. Zhang, C. Jiang, and H. Liu, “Identification of Fake Reviews Using Semantic and Behavioral Features,” in Proceedings of the 4th International Conference on Information Management (ICIM 2018). Oxford: IEEE, 2018, pp. 92–97.
[22] F. Santel, I. Bah, K. Kim, J.-A. Lin, J. McCracken, and A. Teme, “Assessing Readability and Comprehension of Informed Consent Materials for Medical Device Research: A Survey of Informed Consents from FDA’s Center for Devices and Radiological Health,” Contemporary Clinical Trials, p. 105831, 2019.
[23] R. N. Arkell and H. C. van Dyck, “Readability of Human Rights Material,” Canadian Community Law Journal – Revue Canadienne de Droit, vol. 2, pp. 12–14, 1978.
[24] L. M. Tan and G. Tower, “The Readability of Tax Laws: An Empirical Study in New Zealand,” Australian Tax Forum, vol. 9, no. 3, pp. 355–372, 1992.
[25] R. Small, J. Platania, and B. Cutler, “Assessing the Readability of Capital Pattern Jury Instructions,” Jury Expert, vol. 25, no. 1, pp. 18–22, 2013.
[26] R. Whalen, “Judicial Gobbledygook: The Readability of Supreme Court Writing,” Yale Law Journal Forum, vol. 125, pp. 200–211, 2015.
[27] D. R. Ploch, B. K. Dumas, G. B. Gray, B. J. MacLennan, and J. E. Nolt, “Readability of the Law: Forms of Law for Building Legal Expert Systems,” Jurimetrics, vol. 33, no. 2, pp. 189–222, 1993.
[28] European Union, “Digital Single Market,” 2021, Summaries of EU Legislation. Available online in February 2021: https://eur-lex.europa.eu/summary/chapter/31.html.
[29] S. Bansal and C. Aggarwal, “Textstat,” 2021, Version 0.7.0. Available online in January: https://pypi.org/project/textstat/.
[30] J. P. Kincaid, R. P. Fishburne, R. L. Rogers, and B. Chissom, “Derivation of New Readability Formulas (Automated Readability Index, Fog Count and Flesch Reading Ease Formula) for Navy Enlisted Personnel,” 1975, Naval Technical Training Command, Research Branch Report 8-75. Available online in February 2021: https://apps.dtic.mil/sti/pdfs/ADA006655.pdf.
[31] G. H. McLaughlin, “SMOG Grading – a New Readability Formula,” Journal of Reading, vol. 12, no. 8, pp. 639–646, 1969.
[32] E. A. Smith and J. P. Kincaid, “Derivation and Validation of the Automated Readability Index for Use with Technical Materials,” Human Factors, vol. 12, no. 5, pp. 457–564, 1970.
[33] M. Coleman and T. L. Liau, “A Computer Readability Formula Designed for Machine Scoring,” Journal of Applied Psychology, vol. 60, pp. 283–284, 1975.
[34] E. B. Entin and G. R. Klare, “Factor Analyses of Three Correlation Matrices of Readability Variables,” Journal of Reading Behavior, vol. 10, no. 3, pp. 279–290, 1979.
[35] T. François and E. Miltsakaki, “Do NLP and Machine Learning Improve Traditional Readability Formulas?” in Proceedings of the First Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR 2012). Montreal: ACM, 2012, pp. 49–57.
[36] L. C. R. Timaná, D. F. S. Lozano, and J. F. C. García, “Software to Determine the Readability of Written Documents by Implementing a Variation of the Gunning Fog Index Using the Google Linguistic Corpus,” in Proceedings of the International Conference on Applied Technologies (ICAT 2019). Quito: Springer, 2020, pp. 409–420.
[37] L. J. Cronbach, “Coefficient Alpha and the Internal Structure of Tests,” Psychometrika, vol. 16, no. 3, pp. 297–334, 1951.
[38] B. Wasike, “Preaching to the Choir? An Analysis of Newspaper Readability vis-a-vis Public Literacy,” Journalism, vol. 19, no. 11, pp. 1570–1587, 2016.
[39] J. Stricker, A. Chasiotis, M. Kerwer, and A. Günther, “Scientific Abstracts and Plain Language Summaries in Psychology: A Comparison Based on Readability Indices,” PLoS ONE, vol. 15, no. 4, p. e0231160, 2020.
[40] J. Ruohonen, “David and Goliath: Privacy Lobbying in the European Union,” 2019, Archived manuscript, available online: arXiv:1906.01883.
[41] T. Allee and M. Elsig, “Are the Contents of International Treaties Copied and Pasted? Evidence from Preferential Trade Agreements,” International Studies Quarterly, vol. 63, no. 3, pp. 603–613, 2019.
[42] J. Korunovska, B. Kamleitner, and S. Spiekermann, “The Challenges and Impact of Privacy Policy Comprehension,” in Proceedings of the Twenty-Eight European Conference on Information Systems (ECIS 2020). AIS, 2020, pp. 1–17.
[43] N. S. Goltz and G. Dondoli, “A Note on Science, Legal Research and Artificial Intelligence,”