Venera Arnaoudova | Researchain

Publication

Featured research published by Venera Arnaoudova.

IEEE Transactions on Software Engineering | 2014

REPENT: Analyzing the Nature of Identifier Renamings

Venera Arnaoudova; Laleh Mousavi Eshkevari; Massimiliano Di Penta; Giuliano Antoniol; Yann-Gaël Guéhéneuc

Source code lexicon plays a paramount role in software quality: poor lexicon can lead to poor comprehensibility and even increase software fault-proneness. For this reason, renaming a program entity, i.e., altering the entity identifier, is an important activity during software evolution. Developers rename when they feel that the name of an entity is no longer consistent with its functionality, or when such a name may be misleading. A survey that we performed with 71 developers suggests that 39 percent perform renaming from a few times per week to almost every day and that 92 percent of the participants consider that renaming is not straightforward. However, despite the cost that is associated with renaming, renamings are seldom if ever documented: for example, less than 1 percent of the renamings in the five programs that we studied were documented. This explains why participants largely agree on the usefulness of automatically documenting renamings. In this paper we propose REnaming Program ENTities (REPENT), an approach to automatically document (that is, detect and classify) identifier renamings in source code. REPENT detects renamings based on a combination of source code differencing and data flow analyses. Using a set of natural language tools, REPENT classifies renamings into the different dimensions of a taxonomy that we defined. Using the documented renamings, developers will be able to, for example, look up methods that are part of the public API (as they impact client applications), or look for inconsistencies between the name and the implementation of an entity that underwent a high-risk renaming (e.g., towards the opposite meaning). We evaluate the accuracy and completeness of REPENT on the evolution history of five open-source Java programs. The study indicates a precision of 88 percent and a recall of 92 percent. In addition, we report an exploratory study investigating and discussing how identifiers are renamed in the five programs, according to our taxonomy.

International Conference on Software Maintenance | 2010

Physical and conceptual identifier dispersion: Measures and relation to fault proneness

Venera Arnaoudova; Laleh Mousavi Eshkevari; Yann-Gaël Guéhéneuc; Giuliano Antoniol

Poorly chosen identifiers have been reported in the literature as misleading and as increasing the program comprehension effort. Identifiers are composed of terms, which can be dictionary words, acronyms, contractions, or simple strings. We conjecture that the use of identical terms in different contexts may increase the risk of faults. We investigate our conjecture using a measure combining term entropy and term context coverage to study whether certain terms increase the odds ratios of methods to be fault-prone. Entropy measures the physical dispersion of terms in a program: the higher the entropy, the more scattered the terms are across the program. Context coverage measures the conceptual dispersion of terms: the higher their context coverage, the more unrelated the methods using them. We compute term entropy and context coverage of terms extracted from identifiers in Rhino 1.4R3 and ArgoUML 0.16. We show statistically that methods containing terms with high entropy and context coverage are more fault-prone than others.
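
To make the notion of physical dispersion concrete, here is a minimal sketch that computes the Shannon entropy of a term's occurrences over the methods that use it. It is an illustration only: the function name, the input layout, and the sample method names are assumptions, and the exact formulation used in the paper (for example, any normalisation) may differ.

```python
import math


def term_entropy(occurrences):
    """Shannon entropy (in bits) of a term's distribution over methods.

    `occurrences` maps a method name to the number of times the term
    appears in that method (a hypothetical input format).
    """
    total = sum(occurrences.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in occurrences.values():
        if count > 0:
            p = count / total  # share of the term's occurrences in this method
            entropy -= p * math.log2(p)
    return entropy


# A term used in a single method is not dispersed at all.
concentrated = {"Parser.parse": 8}
# The same number of occurrences spread evenly over four methods.
scattered = {
    "Parser.parse": 2,
    "Lexer.next": 2,
    "Emitter.write": 2,
    "Cache.get": 2,
}

print(term_entropy(concentrated))
print(term_entropy(scattered))
```

Under this simplified definition, a term confined to one method has zero entropy, while a term spread evenly over more methods scores higher. Context coverage, the conceptual counterpart, is not modelled here.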

Working Conference on Reverse Engineering | 2012

Can Lexicon Bad Smells Improve Fault Prediction?

Surafel Lemma Abebe; Venera Arnaoudova; Paolo Tonella; Giuliano Antoniol; Yann-Gaël Guéhéneuc

In software development, early identification of fault-prone classes can save a considerable amount of resources. In the literature, source code structural metrics have been widely investigated as one of the factors that can be used to identify faulty classes. Structural metrics measure code complexity, one aspect of the source code quality. Complexity might affect program understanding and hence increase the likelihood of inserting errors in a class. Besides the structural metrics, we believe that the quality of the identifiers used in the code may also affect program understanding and thus increase the likelihood of error insertion. In this study, we measure the quality of identifiers using the number of Lexicon Bad Smells (LBS) they contain. We investigate whether using LBS in addition to structural metrics improves fault prediction. To conduct the investigation, we assess the prediction capability of a model while using i) only structural metrics, and ii) structural metrics and LBS. The results on three open source systems, ArgoUML, Rhino, and Eclipse, indicate that there is an improvement in the majority of the cases.

Empirical Software Engineering | 2016

Linguistic antipatterns: what they are and how developers perceive them

Venera Arnaoudova; Massimiliano Di Penta; Giuliano Antoniol

Antipatterns are known as poor solutions to recurring problems. For example, Brown et al. and Fowler define practices concerning poor design or implementation solutions. However, we know that the source code lexicon is part of the factors that affect the psychological complexity of a program, i.e., factors that make a program difficult to understand and maintain by humans. The aim of this work is to identify recurring poor practices related to inconsistencies among the naming, documentation, and implementation of an entity, called Linguistic Antipatterns (LAs), that may impair program understanding. To this end, we first mine examples of such inconsistencies in real open-source projects and abstract them into a catalog of 17 recurring LAs related to methods and attributes. Then, to understand the relevancy of LAs, we perform two empirical studies with developers: 30 external (i.e., not familiar with the code) and 14 internal (i.e., people developing or maintaining the code). Results indicate that the majority of the participants (69% and 51% of the external and internal developers, respectively) perceive LAs as poor practices that must therefore be avoided. As further evidence of LAs’ validity, open source developers that were made aware of LAs reacted to the issue by making code changes in 10% of the cases. Finally, in order to facilitate the use of LAs in practice, we identified a subset of LAs which were universally agreed upon as being problematic: those which had a clear dissonance between code behavior and lexicon.

IEEE International Conference on Software Analysis, Evolution and Reengineering | 2015

Would static analysis tools help developers with code reviews?

Sebastiano Panichella; Venera Arnaoudova; Massimiliano Di Penta; Giuliano Antoniol

Code reviews have been conducted for decades in software projects, with the aim of improving code quality from many different points of view. During code reviews, developers are supported by checklists, coding standards and, possibly, by various kinds of static analysis tools. This paper investigates whether warnings highlighted by static analysis tools are taken care of during code reviews and whether there are kinds of warnings that tend to be removed more than others. Results of a study conducted by mining the Gerrit repository of six Java open source projects indicate that the density of warnings varies only slightly after each review. The overall percentage of warnings removed during reviews is slightly higher than what previous studies found for the overall project evolution history. However, when looking (quantitatively and qualitatively) at specific categories of warnings, we found that during code reviews developers focus on certain kinds of problems. For such categories of warnings the removal percentage tends to be very high, often above 50% and sometimes up to 100%. Examples of those are warnings in the imports, regular expressions, and type resolution categories. In conclusion, while a broad warning detection might produce far too many false positives, enforcing the removal of certain warnings prior to the patch submission could reduce the amount of effort required during the code review process.

International Conference on Software Engineering | 2015

The use of text retrieval and natural language processing in software engineering

Sonia Haiduc; Venera Arnaoudova; Andrian Marcus; Giuliano Antoniol

This technical briefing presents the state-of-the-art Text Retrieval and Natural Language Processing techniques used in Software Engineering and discusses their applications in the field.

Mining Software Repositories | 2011

An exploratory study of identifier renamings

Laleh Mousavi Eshkevari; Venera Arnaoudova; Massimiliano Di Penta; Yann-Gaël Guéhéneuc; Giuliano Antoniol

Identifiers play an important role in source code understandability, maintainability, and fault-proneness. This paper reports a study of identifier renamings in software systems, studying how terms (identifier atomic components) change in source code identifiers. Specifically, the paper (i) proposes a term renaming taxonomy, (ii) presents an approximate lightweight code analysis approach to detect and classify term renamings automatically into the taxonomy dimensions, and (iii) reports an exploratory study of term renamings in two open-source systems, Eclipse-JDT and Tomcat. We thus report evidence that not only synonyms are involved in renamings but also (in a small fraction) more unexpected changes occur: surprisingly, we detected hypernym (a more abstract term, e.g., size vs. length) and hyponym (a more concrete term, e.g., restriction vs. rule) renamings, and antonym renamings (a term replaced with one having the opposite meaning, e.g., closing vs. opening). Despite being only a fraction of all renamings, synonym, hyponym, hypernym, and antonym renamings may hint at some program understanding issues and, thus, could be used in a renaming recommendation system to improve code quality.

PeerJ | 2015

On the Impact of Sampling Frequency on Software Energy Measurements

Rubén Saborido; Venera Arnaoudova; Giovanni Beltrame; Foutse Khomh; Giuliano Antoniol

Energy consumption is a major concern when developing and evolving mobile applications. The user wishes to access fast and powerful mobile applications, which is usually in contrast to optimized battery life and heat generation. The software engineering community has acknowledged the relevance of the problem and researchers are investigating ways to reduce energy consumption, for example by examining which library, device configuration, and application parameters should be used to promote long battery life. We conjecture that these studies are at the border between hardware and software and we must be careful about how the energy consumption is measured and how it is attributed to methods and libraries. To the best of our knowledge, no previous work investigates how much energy and power consumption is due to high-frequency events missed when sampling at low frequencies such as 10 kHz, or verifies the error at the method level. Low-frequency sampling is a rough approximation that hinders the understanding of fine-grained details: the real picture of energy consumption as well as the root causes are missed. This has profound implications for the choice of methods to evolve or components to replace. In this paper, we propose an approach for accurate measurements of the energy consumption of mobile applications. We apply the proposed approach to assess the energy consumption of 21 mobile, closed source applications and four open source Android applications. We show that by sampling at 10 kHz one may expect a median error of 8%; however, such error may be as high as 50% for short, fast-executing methods. Finally, we revisit a previous approach that estimates the energy consumption of methods based on execution time and find that it can miss as much as 84% of the energy, with a median of 30%. Index Terms: Software Energy Consumption, Performance, Android, Monitoring.
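
The effect of low-frequency sampling can be illustrated with a small synthetic sketch. Everything in it is an assumption made for illustration (the 500 kHz reference rate, the power values, the spike position, and the function names); it is not the measurement setup or the data from the paper. It integrates the same power trace at a high rate and at 10 kHz and compares the two energy estimates.

```python
def energy_joules(power_watts, sample_rate_hz):
    """Rectangle-rule estimate: each sample covers one sampling period."""
    return sum(power_watts) / sample_rate_hz


def downsample(samples, factor):
    """Keep every `factor`-th sample, as a slower sampler would observe."""
    return samples[::factor]


HIGH_RATE_HZ = 500_000  # hypothetical reference sampling rate
LOW_RATE_HZ = 10_000    # the low rate discussed in the abstract
FACTOR = HIGH_RATE_HZ // LOW_RATE_HZ  # 50 high-rate samples per low-rate sample

# Synthetic 10 ms trace: 1 W baseline with one 40-microsecond spike at 5 W.
trace = [1.0] * 5000
for i in range(110, 130):
    trace[i] = 5.0

high = energy_joules(trace, HIGH_RATE_HZ)
low = energy_joules(downsample(trace, FACTOR), LOW_RATE_HZ)

print(f"high-rate estimate: {high:.5f} J")
print(f"10 kHz estimate:    {low:.5f} J")
print(f"relative error:     {(high - low) / high:.2%}")
```

Because the spike falls entirely between two 10 kHz samples, the low-rate estimate sees only the baseline. The shorter the window being measured (for instance a single short method), the larger the share of its energy such a missed event represents.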

Software Quality Journal | 2017

Investigating the relation between lexical smells and change- and fault-proneness: an empirical study

Latifa Guerrouj; Zeinab Kermansaravi; Venera Arnaoudova; Benjamin C. M. Fung; Foutse Khomh; Giuliano Antoniol; Yann-Gaël Guéhéneuc

Past and recent studies have shown that design smells, which are poor solutions to recurrent design problems, make object-oriented systems difficult to maintain, and that they negatively impact the class change- and fault-proneness. More recently, lexical smells have been introduced to capture recurring poor practices in the naming, documentation, and choice of identifiers during the implementation of an entity. Although recent studies show that developers perceive lexical smells as impairing program understanding, no study has actually evaluated the relationship between lexical smells and software quality as well as their interaction with design smells. In this paper, we detect 29 smells consisting of 13 design smells and 16 lexical smells in 30 releases of three projects: ANT, ArgoUML, and Hibernate. We analyze to what extent classes containing lexical smells have higher (or lower) odds to change or to be subject to fault fixing than other classes containing design smells. Our results provide empirical evidence that lexical smells can make, in some cases, classes with design smells more fault-prone. In addition, we empirically demonstrate that classes containing design smells only are more change- and fault-prone than classes with lexical smells only.

Journal of Software: Evolution and Process | 2014

SCAN: an approach to label and relate execution trace segments

Soumaya Medini; Venera Arnaoudova; Massimiliano Di Penta; Giuliano Antoniol; Yann-Gaël Guéhéneuc; Paolo Tonella

Identifying concepts in execution traces is a task often necessary to support program comprehension or maintenance activities. Several approaches (static, dynamic, or hybrid) have been proposed to identify cohesive, meaningful sequences of methods in execution traces. However, none of the proposed approaches is able to label such segments and to identify relations between segments of the same trace. This paper presents SCAN (Segment Concept AssigNer), an approach to assign labels to sequences of methods in execution traces, and to identify relations between such segments. SCAN uses information retrieval methods and formal concept analysis to produce sets of words helping the developer to understand the concept implemented by a segment. Specifically, formal concept analysis allows SCAN to discover commonalities between segments in different trace areas, as well as terms more specific to a given segment and high-level relations between segments. The paper describes SCAN along with a preliminary manual validation, upon execution traces collected from usage scenarios of JHotDraw and ArgoUML, of SCAN's accuracy in assigning labels representative of concepts implemented by trace segments.
