Speculative Analysis for Quality Assessment of Code Comments
Pooja Rani
Software Composition Group, University of Bern, Bern, Switzerland, scg.unibe.ch/staff
Abstract—Previous studies have shown that high-quality code comments assist developers in program comprehension and maintenance tasks. However, the semi-structured nature of comments, unclear conventions for writing good comments, and the lack of quality assessment tools for all aspects of comments make their evaluation and maintenance a non-trivial problem. To achieve high-quality comments, we need a deeper understanding of code comment characteristics and the practices developers follow. In this thesis, we approach the problem of assessing comment quality from three different perspectives: what developers ask about commenting practices, what they write in comments, and how researchers support them in assessing comment quality. Our preliminary findings show that developers embed various kinds of information in class comments across programming languages. Still, they face problems in locating relevant guidelines to write consistent and informative comments, verifying the adherence of their comments to the guidelines, and evaluating the overall state of comment quality. To help developers and researchers build comment quality assessment tools, we provide: (i) an empirically validated taxonomy of comment convention-related questions from various community forums, (ii) an empirically validated taxonomy of comment information types from various programming languages, (iii) a language-independent approach to automatically identify the information types, and (iv) a comment quality taxonomy prepared from a systematic literature review.
Index Terms—code comments, mining developer sources, developer information needs, comment quality assessment
I. INTRODUCTION
Well-documented code facilitates various software development and maintenance activities [1], [2]. Several studies show that high-quality code comments help developers in program comprehension [3], suitable API selection [4], and bug detection [5]. However, comments are written in natural language, and their syntax and semantics are neither enforced by a programming language nor checked by the compiler. As a result, developers are free to use numerous means and conventions to write comments [6], and to embed various types of information in them [7], thus making the quality evaluation of comments more complicated.

To guide developers in writing consistent and informative comments, programming language communities, such as those for Java and Python, and large organizations, such as Google and Oracle, provide coding style guidelines. However, these guidelines only marginally cover aspects of commenting code such as content, style, and syntax. Furthermore, the availability of several guidelines for a language makes developers unsure about which comment conventions to use, which syntax to follow, and which type of information to write for which kinds of comments. Therefore, developers ask questions on mailing lists and on community platforms such as Stack Overflow (SO) and Quora to address these issues [8], [9]. Analyzing such developer concerns is valuable to understand their needs and to identify challenges related to commenting practices. Similarly, analyzing their actual commenting practices is essential to understand the information they embed in comments and to ensure the quality of that information.

Previous studies have characterized developer commenting practices in OOP languages by classifying comments based on the information they contain [7], [10]–[12]. Given the variety of comment types (class, method, or inline), not all comment types describe the source code at the same level of abstraction; quality assessment tools therefore need to be tailored accordingly.
For example, class comments in Java should present high-level information about a class, whereas method comments should present implementation-level details [13]. These commenting conventions vary across programming languages. For instance, in comparison to Java, class comments in Smalltalk are expected to contain both high-level design details and low-level implementation details. Given the increasing usage of multi-language software systems [14] and persistent concerns about maintaining high documentation quality, it is critical to understand what developers write in a particular comment type, and to build tools to extract and check the embedded information across languages. This can also help to verify the extent to which a comment type adheres to a coding style guideline from the content aspect.

Even when a comment adheres to its coding style guidelines from all aspects, such as content, syntax, and style, it is still possible that the comment is incomplete or inconsistent with the code, and thus lacks the desired high quality. Therefore, several other quality attributes that can affect comment quality need to be considered in the overall assessment of comments. Researchers have proposed numerous comment quality evaluation models based on a number of metrics [6], [15] and classification approaches [11]. However, a unifying comment quality taxonomy that expresses the purposes for which researchers evaluate comments, and the quality attributes they consider important and frequently integrate in their comment quality models or tools, is still missing.

In summary, a good understanding of the existing practices that developers follow, and of the comment quality models researchers suggest, is necessary to bridge the gap between the notion of quality and its concrete implementation.
To gain this required understanding, we analyze code comments from various perspectives: those of developers, in terms of what they ask and what they write in comments, and those of researchers, in terms of what they suggest. In this exploration of the semantics embedded in comments, we use speculative analysis, by analogy with speculative execution (e.g., branch prediction). Previous studies have shown how speculative analysis can be used to develop tools that inform developers early and precisely about the potential consequences of their actions [16], [17]. In our case, we are interested in supporting developers in ensuring comment quality while writing or using comments for various development tasks.

The goal of this thesis is to investigate practices in code comment writing and evaluation in a stepwise manner to ultimately improve comment quality assessment techniques. I state my thesis as follows:

Understanding the specification of high-quality comments to build effective assessment tools requires a multi-perspective view of the comments. This view can be approached by analyzing (1) developer concerns about comments, (2) their commenting practices within IDEs, and (3) the required quality attributes for their comments.
This dissertation focuses on three main questions:

• What do developers ask about commenting practices? Answering this can help in (i) identifying the key challenges developers face with current conventions and tools, and (ii) adapting these approaches accordingly.

• What information do developers write in comments? Understanding this can support the development of tools and approaches (i) to identify important information types and comment clones automatically, and (ii) to verify the adherence of comments to style guidelines from the content aspect.

• How do researchers support comment quality assessment? Answering this question can help in (i) identifying the limitations of existing tools and techniques, and (ii) adapting the tools according to the proposed comment quality taxonomy.

II. THESIS VISION
This section presents the studies conducted to answer the main questions, as shown in Figure 1. The following subsections briefly describe the motivation, methodology, and preliminary findings of each study.
A. What do developers ask about commenting practices?
Previous studies have leveraged various online platforms to gain a deep understanding of developers' needs and challenges [8], [9], [18]. We investigated the popular Q&A forums SO and Quora, and Apache project-specific mailing lists, to understand developers' commenting practices.
Methodology. To answer this question, we mined and preprocessed 11,931 posts extracted using 14 relevant tags on SO. The tags were selected using a hybrid approach combining
Fig. 1. Overview of my dissertation with all research questions, their methodology, and results.

a heuristics-based approach used by Yang et al. [9] and a keyword-based approach used successfully by Aghajani et al. [18]. We used a semi-automated approach based on Latent Dirichlet Allocation (LDA) [19], an advanced and popular topic modeling technique, to identify topics from the selected posts. To uncover developer concerns in detail (mainly which types of questions they ask, and about which tools and techniques), we manually analyzed a statistically significant sample of posts from SO, Quora, and the mailing lists, and formulated a taxonomy of these concerns. The taxonomy offers an overview of the leading questions on commenting conventions in a formal, structured, and possibly exhaustive way.
Findings. Our study results highlight that: (i) Developers ask questions about best practices for writing comments (15% of the questions) and about generating comments automatically using various tools and technologies. (ii) Among the 14 topics identified by LDA, we found five irrelevant topics, due to the generality and commonality of the tags (e.g., "convention," "commenting"). (iii) From our manual analysis, we found that developers are interested in embedding various kinds of information, such as code examples and media (e.g., images), in their code comments, but lack clear guidelines to write them. (iv) Developers post questions about documentation tools on SO, whereas no such questions are reported on Quora. In the mailing lists, we did not find enough developer discussion about comment conventions.
Conclusion. This analysis shows that developers use various community platforms to raise concerns about code comments. Such concerns hint at the challenges developers face, and at what they need from programming language communities, tools, technologies, and researchers. Conveying clear guidelines to write good comments, and building tools to verify the adherence of comments to these guidelines, indicate possible directions to support developers.
B. What information do developers write in comments?
Source code comments consist of several comment types (class comments, method comments, inline comments), but not all comment types contain the same types of information. We start our analysis by first focusing on class comments, which play an important role in obtaining a high-level overview of classes in object-oriented programming languages [20]. Class commenting practices, however, vary across programming languages. For instance, a class comment in Java or Python contains high-level overview details and uses annotations (e.g., @param, @return) to express specific types of information. In contrast, class comments in Smalltalk contain detailed design and implementation documentation, and they do not make use of any annotation. We first investigated class comments in Pharo (a modern Smalltalk environment), and identified the types of information developers embed in them by studying the research question RQ1: What types of information are present in Pharo class comments?
Then we measured the adherence of Pharo class comments to the class comment guidelines. To generalize our findings across languages, we extended our analysis to other programming languages, namely Java and Python. We systematically compared the commonalities and differences among class commenting practices with the research question RQ2: To what extent do information types vary across programming languages?
To automate the identification of information types from class comments across languages, we studied the research question RQ3: Can machine learning be used to automatically identify class comment types according to our taxonomy?
Methodology. To answer RQ1, we conducted a three-iteration analysis of a statistically significant sample of 714 comments selected from internal and external projects of Pharo. Three authors analyzed the content of the comments using open card sorting and pair sorting to build and validate the comment taxonomy. For Python and Java class comments, we used the initial comment taxonomies available from previous work [7], [12], and analyzed and validated the content using the closed card sorting technique. Based on the constructed taxonomy, i.e., the Class Comment Type Model (CCTM), and the labelled data from each language, we answered RQ2. To automatically classify class comment types according to CCTM for RQ3, we used an approach that leverages two techniques, namely Natural Language Processing (NLP) and TF-IDF. We use the TF-IDF technique as a baseline due to its successful adoption in recent work on classifying code comments [21]. We transform the multi-label classification problem into a set of single-label classification problems to balance one label at a time and avoid over-fitting the categories. We adopt a 10-fold cross-validation strategy with a standard probabilistic Naive Bayes classifier, the J48 tree model, and the Random Forest model, following recent work [7]. We evaluate RQ3 by measuring the precision, recall, and F-measure of our approach against the TF-IDF baseline.

Findings. Our results highlight that: (i) Developers express different kinds of information (more than 15 information types) in class comments, ranging from a high-level overview of the class to low-level implementation details, across programming languages. (ii) Class comments contain various types of information, but not all of these information types are suggested by coding guidelines; this behaviour is observed across all the selected languages. In the case of Pharo, the information types suggested by the guidelines were observed more frequently than other information types.
We are still in the process of verifying this observation in other programming languages. (iii) The Random Forest algorithm fed with the combination of NLP+TF-IDF features achieves the best classification performance for the top six frequent categories over the investigated languages, with relatively high precision (ranging from 78% to 92% for the selected languages), recall (ranging from 86% to 92%), and F-measure (ranging from 77% to 92%), where Pharo achieves less stable results compared to Python and Java.
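The baseline classification setup can be sketched as follows. The toy comments, the single hypothetical "summary" category, and the scikit-learn pipeline are illustrative assumptions; the actual study additionally combines NLP-pattern features with TF-IDF and evaluates with 10-fold cross-validation on labelled comments from three languages.

```python
# Sketch of the TF-IDF + Random Forest baseline, treating one CCTM
# category at a time (binary relevance turns the multi-label problem
# into independent single-label problems).
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Toy class comments labelled for a single hypothetical category,
# "summary" (1 = the comment summarizes the class's purpose).
comments = [
    "Represents a bank account and tracks its balance.",
    "TODO: refactor this class before the next release.",
    "Models a customer order placed in the web shop.",
    "See the unit tests for usage examples of this class.",
]
is_summary = [1, 0, 1, 0]

# One binary classifier for this category; in the full setup one such
# classifier is trained per CCTM category.
clf = make_pipeline(TfidfVectorizer(), RandomForestClassifier(random_state=0))
clf.fit(comments, is_summary)

# Classify an unseen comment against this single category.
pred = clf.predict(["Describes an invoice and its line items."])
print(pred)
```

Precision, recall, and F-measure are then computed per category and compared against the TF-IDF-only baseline.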
Conclusion. This analysis highlights the diverse types of information developers embed in class comments, regardless of the coding style guidelines for comments. Given the benefits of retrieving these information types automatically for various development tasks, it also highlights the challenges in unifying retrieval approaches across languages.
C. How do researchers support comment quality assessment?
Software quality is frequently treated as a contextual concept. Measuring it therefore requires, as a first step, identifying and quantifying the important characteristics of high-quality software [22]. The main objective of our literature review is to identify the quality attributes that are used to assess code comment quality, and to collect the metrics used to measure these quality attributes. Additionally, we are interested in which tools and models researchers have proposed to assess comment quality. To achieve these objectives, we plan to conduct a systematic literature review (SLR) to answer the following research questions. RQ1: What quality attributes are used to evaluate the quality of code comments, and what metrics are used to estimate these quality attributes? RQ2: Which quality attributes and metrics do current assessment tools support?
Methodology. We plan to conduct the SLR following the guidelines of Kitchenham [23]. We organize the study steps into the SLR phases: planning, conducting the review, and reporting. In the planning phase, we identify the objectives of our SLR and specify the research questions. We plan to review the proceedings of the past ten years, i.e., 2010-2020, of the relevant SE conferences and journals according to the Computing Research and Education Association of Australasia (CORE) ranking. We formulate inclusion and exclusion criteria, and based on these criteria, we plan to systematically identify relevant studies.
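The study selection step can be sketched as a simple filter over candidate papers. The paper records, the year range, and the accepted rank set below are assumptions for illustration; the actual protocol defines its criteria in the review plan.

```python
# Illustrative sketch of applying SLR inclusion criteria: keep papers
# published 2010-2020 in venues with an accepted CORE rank.
papers = [
    {"title": "Quality analysis of source code comments", "year": 2013, "rank": "A"},
    {"title": "A documentation suite", "year": 1996, "rank": "B"},
    {"title": "Comment smells", "year": 2019, "rank": "C"},
]

# Hypothetical set of accepted CORE ranks for this sketch.
ACCEPTED_RANKS = {"A*", "A", "B"}

def included(paper):
    """Inclusion criteria: year range and venue rank."""
    return 2010 <= paper["year"] <= 2020 and paper["rank"] in ACCEPTED_RANKS

selected = [p["title"] for p in papers if included(p)]
print(selected)  # → ['Quality analysis of source code comments']
```

Each paper passing this automatic filter would still be screened manually against the full exclusion criteria.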
Expected output. Insights from the SLR are expected to provide a detailed view of the tools and techniques proposed by researchers to assess the quality of comments. Based on these insights, we plan to prepare a comment quality taxonomy that can help researchers and developers identify the quality attributes suitable for a comment type and integrate the relevant measures into their tools as per their requirements.
III. PRELIMINARY AND EXPECTED CONTRIBUTIONS
From each study, we present empirical insights, approaches, and tools to support developers and researchers in ensuring high-quality comments.

• For the first question, "What do developers ask about commenting practices?", we present: (i) an empirically validated taxonomy of comment convention-related questions from various community forums, and (ii) a tool to conduct a mining study on multiple sources or forums.

• For the second question, "What information do developers write in comments?", we provide: (i) an overview of Pharo class commenting trends over seven major releases until 2019, (ii) an empirically validated taxonomy, called CCTM, characterizing the information types found in class comments written by developers in three different programming languages, and (iii) an automated approach (available for research purposes) able to accurately classify class comments according to CCTM.

• For the third question, "How do researchers support comment quality assessment?", we expect to deliver: (i) a comment quality taxonomy to identify relevant quality attributes, and (ii) a review of existing tools and techniques that assess the quality of code comments.

IV. PROPOSED TIMELINE
I am a third-year PhD student and will enter the final year of my PhD in January 2021. The expected timeline for the projects is as follows:

• The first study (II-A) has been submitted to the journal-first track of Transactions on Software Engineering and Methodology, 2020 (TOSEM'20).

• RQ1 of the second study (II-B) is currently undergoing a minor revision in the journal-first track of Empirical Software Engineering (EMSE'19) [24].

• The other research questions (RQ2 and RQ3) of the second study (II-B) have been submitted to the Journal of Systems and Software (JSS'20).

• The third study (II-C) is planned to be submitted to Transactions on Software Engineering (TSE'21).

V. RELATED WORK
Comment conventions (RQ1): Developers frequently use various web resources to satisfy their information needs. Recently, researchers have started leveraging these resources, such as version control systems [25], Q&A forums [8], [9], and mailing lists [18]. In the context of software documentation, Aghajani et al. studied documentation issues on SO, GitHub, and mailing lists [18] and formulated a taxonomy of these issues. However, they focused on issues related to project documentation, such as wikis, user manuals, and code documentation, and not specifically on issues concerning code comment conventions. Barua et al. found questions concerning coding style and practice to be amongst those most frequently appearing on SO [8], but did not investigate them further. Our first question (II-A) focuses specifically on the problems related to commenting practices that developers discuss on SO, Quora, and mailing lists.
Identifying information types in comments (RQ2): Code comments contain valuable information that helps developers in various activities and tasks. Pascarella et al. identified the information types in Java code comments and presented a taxonomy [7]. Similarly, Zhang et al. identified information types in Python code comments [12]. We focused specifically on class commenting practices and used their taxonomies as initial taxonomies to classify class comments. Compared to the work of Pascarella et al. and Zhang et al. [7], [12], we found several other types of information, such as warnings, observations, and recommendations, that developers embed in class comments. To identify different kinds of information in comments automatically, several studies have explored approaches based on heuristics or textual features [26], [27]. In contrast to these previous approaches, we extracted the natural language patterns (heuristics) automatically using a tool, combined them with other textual features, and tested our approach across languages.
Comment quality (RQ3): Apart from identifying the information embedded in comments, assessing comments from other perspectives has gained a lot of attention from researchers in past years, for example, assessing comment quality [11], [15], detecting inconsistency between code and comments [28], [29], and examining the co-evolution of code and comments [30]. The main aim is to keep comments consistent with the code and to maintain their high quality. Several recent works have proposed tools and techniques to automatically assess comments using specific quality attributes and metrics [11], [15], [31]. However, a unifying model of the comment quality attributes and metrics that are considered important for assessing comments is still missing. Previous literature reviews have provided quality models for software documentation [32], [33], but we focus specifically on code comments.

VI. CONCLUSION
To improve the state of comment quality assessment techniques, this thesis focuses on three main questions: what do developers ask about commenting practices, what do they write in comments, and how do researchers support the assessment of comments. Our work draws insights from both empirical evidence mined from developer sources and research results (SLR). Our preliminary findings show that developers embed various kinds of information in comments. Still, they face several problems in locating the specific comment guidelines, verifying the adherence of their comments to coding standards, and evaluating the overall state of comment quality. Our empirical evidence also shows that Pharo developers follow the commenting guidelines when writing class comments. These insights into developer commenting practices across languages can help researchers improve comment quality assessment tools, and evaluate comment summarization and comment generation approaches. We present initial approaches, tools, and a labelled dataset to facilitate future comment analysis work on other languages and environments.

My future work will concentrate on exploring which tasks and activities require developers to search for these information types in comments, and how developers find these information types. Based on developer commenting practices, my objective is to improve my prototype tools and reduce the effort of assessing comment quality. I expect to finish the work for my dissertation in 2021.

REFERENCES

[1] S. C. B. de Souza, N. Anquetil, and K. M. de Oliveira, "A study of the documentation essential to software maintenance," in Proceedings of the 23rd Annual International Conference on Design of Communication: Documenting & Designing for Pervasive Information, ser. SIGDOC '05. New York, NY, USA: ACM, 2005, pp. 68-75.
[2] F. A. Cioch, M. Palazzolo, and S. Lohrer, "A documentation suite for maintenance programmers," in Proceedings of the 1996 International Conference on Software Maintenance, ser. ICSM '96. Washington, DC, USA: IEEE Computer Society, 1996, pp. 286-295. [Online]. Available: http://dl.acm.org/citation.cfm?id=645544.655870
[3] U. Dekel and J. D. Herbsleb, "Reading the documentation of invoked API functions in program comprehension," IEEE, 2009, pp. 168-177.
[4] C. McMillan, D. Poshyvanyk, and M. Grechanik, "Recommending source code examples via API call usages and documentation," in Proceedings of the 2nd International Workshop on Recommendation Systems for Software Engineering, 2010, pp. 21-25.
[5] L. Tan, D. Yuan, G. Krishna, and Y. Zhou, "/* iComment: Bugs or bad comments? */," in Proceedings of Twenty-First ACM SIGOPS Symposium on Operating Systems Principles, 2007, pp. 145-158.
[6] Y. Padioleau, L. Tan, and Y. Zhou, "Listening to programmers — taxonomies and characteristics of comments in operating system code," in Proceedings of the 31st International Conference on Software Engineering. IEEE Computer Society, 2009, pp. 331-341.
[7] L. Pascarella and A. Bacchelli, "Classifying code comments in Java open-source software systems," in Proceedings of the 14th International Conference on Mining Software Repositories, ser. MSR '17. IEEE Press, 2017, pp. 227-237. [Online]. Available: https://doi.org/10.1109/MSR.2017.63
[8] A. Barua, S. W. Thomas, and A. E. Hassan, "What are developers talking about? An analysis of topics and trends in Stack Overflow," Empirical Software Engineering, vol. 19, no. 3, pp. 619-654, 2014. [Online]. Available: https://doi.org/10.1007/s10664-012-9231-y
[9] X.-L. Yang, D. Lo, X. Xia, Z.-Y. Wan, and J.-L. Sun, "What security questions do developers ask? A large-scale study of Stack Overflow posts," Journal of Computer Science and Technology, vol. 31, no. 5, pp. 910-924, 2016. [Online]. Available: https://doi.org/10.1007/s11390-016-1672-0
[10] D. Haouari, H. Sahraoui, and P. Langlais, "How good is your comment? A study of comments in Java programs," IEEE, 2011, pp. 137-146.
[11] D. Steidl, B. Hummel, and E. Juergens, "Quality analysis of source code comments," in Program Comprehension (ICPC), 2013 IEEE 21st International Conference on. IEEE, 2013, pp. 83-92.
[12] J. Zhang, L. Xu, and Y. Li, "Classifying Python code comments based on supervised learning," in International Conference on Web Information Systems and Applications. Springer, 2018, pp. 39-47.
[13] E. Nurvitadhi, W. W. Leung, and C. Cook, "Do class comments aid Java program understanding?" vol. 1. IEEE, 2003, pp. T3C-T3C.
[14] F. Tomassetti and M. Torchiano, "An empirical assessment of polyglotism in GitHub," in Proceedings of the 18th International Conference on Evaluation and Assessment in Software Engineering, 2014, pp. 1-4.
[15] N. Khamis, R. Witte, and J. Rilling, "Automatic quality assessment of source code comments: the JavadocMiner," in International Conference on Application of Natural Language to Information Systems. Springer, 2010, pp. 68-79.
[16] Y. Brun, R. Holmes, M. D. Ernst, and D. Notkin, "Speculative analysis: Exploring future development states of software," in Proceedings of the FSE/SDP Workshop on Future of Software Engineering Research, ser. FoSER '10. New York, NY, USA: ACM, 2010, pp. 59-64. [Online]. Available: http://doi.acm.org/10.1145/1882362.1882375
[17] K. Muşlu, Y. Brun, M. D. Ernst, and D. Notkin, "Making offline analyses continuous," in Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering, 2013, pp. 323-333.
[18] E. Aghajani, C. Nagy, O. L. Vega-Márquez, M. Linares-Vásquez, L. Moreno, G. Bavota, and M. Lanza, "Software documentation issues unveiled," in Proceedings of the 41st International Conference on Software Engineering, ICSE 2019, Montreal, QC, Canada, May 25-31, 2019, J. M. Atlee, T. Bultan, and J. Whittle, Eds. IEEE/ACM, 2019, pp. 1199-1210. [Online]. Available: https://doi.org/10.1109/ICSE.2019.00122
[19] D. M. Blei, A. Y. Ng, and M. I. Jordan, "Latent Dirichlet Allocation," Journal of Machine Learning Research, vol. 3, no. Jan, pp. 993-1022, 2003.
[20] A. Cline, "Testing thread," in Agile Development in the Real World. Springer, 2015, pp. 221-252.
[21] V. Misra, J. S. K. Reddy, and S. Chimalakonda, "Is there a correlation between code comments and issues?: An exploratory study," in SAC '20: The 35th ACM/SIGAPP Symposium on Applied Computing, online event, [Brno, Czech Republic], March 30 - April 3, 2020, 2020, pp. 110-117. [Online]. Available: https://doi.org/10.1145/3341105.3374009
[22] J. Bansiya and C. Davis, "A hierarchical model for object-oriented design quality assessment," IEEE Transactions on Software Engineering, vol. 28, no. 1, pp. 4-17, Jan. 2002.
[23] B. Kitchenham and S. Charters, "Guidelines for performing systematic literature reviews in software engineering," 2007.
[24] P. Rani, S. Panichella, M. Leuenberger, M. Ghafari, and O. Nierstrasz, "What do class comments tell us? An investigation of comment evolution and practices in Pharo," arXiv preprint arXiv:2005.11583, 2020, to appear in Empirical Software Engineering.
[25] T.-H. Chen, S. W. Thomas, and A. E. Hassan, "A survey on the use of topic models when mining software repositories," Empirical Software Engineering, vol. 21, no. 5, pp. 1843-1919, Oct. 2016. [Online]. Available: https://doi.org/10.1007/s10664-015-9402-8
[26] N. Dragan, M. L. Collard, and J. I. Maletic, "Automatic identification of class stereotypes," in Proceedings of the 2010 IEEE International Conference on Software Maintenance, ser. ICSM '10. USA: IEEE Computer Society, 2010, pp. 1-10. [Online]. Available: https://doi.org/10.1109/ICSM.2010.5609703
[27] Y. Shinyama, Y. Arahori, and K. Gondow, "Analyzing code comments to boost program comprehension," IEEE, 2018, pp. 325-334.
[28] I. K. Ratol and M. P. Robillard, "Detecting fragile comments," in Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering. IEEE Press, 2017, pp. 112-122.
[29] F. Wen, C. Nagy, G. Bavota, and M. Lanza, "A large-scale empirical study on code-comment inconsistencies," in Proceedings of the 27th International Conference on Program Comprehension. IEEE Press, 2019, pp. 53-64.
[30] B. Fluri, M. Würsch, E. Giger, and H. C. Gall, "Analyzing the co-evolution of comments and source code," Software Quality Journal, vol. 17, no. 4, pp. 367-394, 2009.
[31] H. Yu, B. Li, P. Wang, D. Jia, and Y. Wang, "Source code comments quality assessment method based on aggregation of classification algorithms," J. Comput. Appl., vol. 36, no. 12, pp. 3448-3453, 2016.
[32] W. Ding, P. Liang, A. Tang, and H. Van Vliet, "Knowledge-based approaches in software documentation: A systematic literature review," Information and Software Technology, vol. 56, no. 6, pp. 545-567, 2014.
[33] J. Zhi, V. Garousi-Yusifoğlu, B. Sun, G. Garousi, S. Shahnewaz, and G. Ruhe, "Cost, benefits and quality of software development documentation: A systematic mapping,"