Ettore Merlo
École Polytechnique de Montréal
Publication
Featured research published by Ettore Merlo.
IEEE Transactions on Software Engineering | 2002
Giuliano Antoniol; Gerardo Canfora; G. Casazza; A. De Lucia; Ettore Merlo
Software system documentation is almost always expressed informally in natural language and free text. Examples include requirement specifications, design documents, manual pages, system development journals, error logs, and related maintenance reports. We propose a method based on information retrieval to recover traceability links between source code and free text documents. A premise of our work is that programmers use meaningful names for program items, such as functions, variables, types, classes, and methods. We believe that the application-domain knowledge that programmers process when writing the code is often captured by the mnemonics for identifiers; therefore, the analysis of these mnemonics can help to associate high-level concepts with program concepts and vice-versa. We apply both a probabilistic and a vector space information retrieval model in two case studies to trace C++ source code onto manual pages and Java code to functional requirements. We compare the results of applying the two models, discuss the benefits and limitations, and describe directions for improvements.
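The vector space side of this idea can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names (`split_identifiers`, `rank_documents`), the TF-IDF weighting formula, and the choice to treat the code component as the query are all assumptions made for the example. Identifiers are split into their mnemonic parts, and each free-text document is ranked by cosine similarity to the code.

```python
import math
import re
from collections import Counter


def split_identifiers(text):
    """Split text into lowercase terms, breaking camelCase and snake_case names."""
    terms = []
    for word in re.findall(r"[A-Za-z]+", text):
        # "openAccount" -> ["open", "Account"], "HTTPServer" -> ["HTTP", "Server"]
        terms.extend(re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word))
    return [t.lower() for t in terms]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_documents(code_text, documents):
    """Rank documents (name -> text) by similarity to one code component."""
    doc_terms = {name: split_identifiers(text) for name, text in documents.items()}
    n = len(doc_terms)
    # Document frequency: in how many documents each term occurs.
    df = Counter(t for terms in doc_terms.values() for t in set(terms))

    def weigh(terms):
        # Assumed weighting: tf * (log(n / df) + 1); terms unseen in the
        # document collection are dropped.
        tf = Counter(t for t in terms if t in df)
        return {t: c * (math.log(n / df[t]) + 1.0) for t, c in tf.items()}

    query = weigh(split_identifiers(code_text))
    scores = {name: cosine(query, weigh(terms)) for name, terms in doc_terms.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

A candidate traceability link would then be any document whose score exceeds a chosen threshold, or the top few documents in the ranking.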
IEEE Transactions on Software Engineering | 2007
Stefan Bellon; Rainer Koschke; Giuliano Antoniol; Jens Krinke; Ettore Merlo
Many techniques for detecting duplicated source code (software clones) have been proposed in the past. However, it is not yet clear how these techniques compare in terms of recall and precision as well as space and time requirements. This paper presents an experiment that evaluates six clone detectors based on eight large C and Java programs (altogether almost 850 KLOC). Their clone candidates were evaluated by one of the authors as an independent third party. The selected techniques cover the whole spectrum of the state-of-the-art in clone detection. The techniques work on text, lexical and syntactic information, software metrics, and program dependency graphs.
international conference on software maintenance | 1997
Bruno Laguë; Daniel Proulx; Jean Mayrand; Ettore Merlo; John P. Hudepohl
The aim of the experiment presented in this paper is to provide insight into the potential benefits of introducing a function clone detection technology in an industrial software development process. To take advantage of function clone detection, two modifications to the software development process are presented. Our experiment consists of evaluating the impact that these proposed changes would have had on a specific software system if they had been applied over a 3-year period (involving 10,000 person-months), during which 6 subsequent versions of the software under study were released. The software under study is a large telecommunication system. In total, 89 million lines of code have been analyzed. A first result showed that, against our expectations, a significant number of clones are being removed from the system over time. However, this effort is insufficient to prevent the growth of the overall number of clones in the system. In this context, the first process change would have added value. We have also found that the second process change would have provided programmers with a significant number of opportunities for correcting problems before customers experienced them. This result shows a potential for improving the software system quality and customer satisfaction.
working conference on reverse engineering | 2000
Magdalena Balazinska; Ettore Merlo; Michel Dagenais; Bruno Laguë; Kostas Kontogiannis
Manual source code copy and modification is often used by programmers as an easy means for functionality reuse. Nevertheless, such practice produces duplicated pieces of code or clones whose consistent maintenance might be difficult to achieve. It also creates implicit links between classes sharing a functionality. Clones are therefore good candidates for system redesign. This paper presents a novel approach for computer-aided clone-based object-oriented system refactoring. The approach is based on an advanced clone analysis which focuses on the extraction of clone differences and their interpretation in terms of programming language entities. It also focuses on the study of contextual dependencies of cloned methods. The clone analysis has been applied to JDK 1.1.5, a large scale system of 150 KLOC.
Information & Software Technology | 2002
Giuliano Antoniol; Umberto Villano; Ettore Merlo; M. Di Penta
Identifying code duplication in large multi-platform software systems is a challenging problem. This is due to a variety of reasons including the presence of high-level programming languages and structures interleaved with hardware-dependent low-level resources and assembler code, the use of GUI-based configuration scripts generating commands to compile the system, and the extremely high number of possible different configurations. This paper studies the extent and the evolution of code duplications in the Linux kernel. Linux is a large, multi-platform software system; it is based on the Open Source concept, and so there are no obstacles in discussing its implementation. In addition, it is decidedly too large to be examined manually: the current Linux kernel release (2.4.18) is about three million LOCs. Nineteen releases, from 2.4.0 to 2.4.18, were processed and analyzed, identifying code duplication among Linux subsystems by means of a metric-based approach. The obtained results support the hypothesis that the Linux system does not contain a relevant fraction of code duplication. Furthermore, code duplication tends to remain stable across releases, thus suggesting a fairly stable structure, evolving smoothly without any evidence of degradation.
ieee international software metrics symposium | 1999
Magdalena Balazinska; Ettore Merlo; Michel Dagenais; Bruno Laguë; Kostas Kontogiannis
Code duplication, plausibly caused by copying source code and slightly modifying it, is often observed in large systems. Clone detection and documentation have been investigated by several researchers in the past years. Recently, research focus has shifted towards the investigation of software and process restructuring actions based on clone detection. This paper presents an original definition of a clone classification scheme useful to assess and measure different system reengineering opportunities. The proposed classification considers each group of cloned methods in terms of the meaning of the differences existing between them. The algorithm used for automatic classification of clones is presented together with results obtained by classifying cloned methods and measuring reengineering opportunities in six freely available systems whose total size is about 500 KLOC of Java code.
international workshop on principles of software evolution | 2004
Giuliano Antoniol; M. Di Penta; Ettore Merlo
When a software system evolves, features are added, removed and changed. Moreover, refactoring activities are periodically performed to improve the software internal structure. A class may be replaced by another, two classes can be merged, or a class may be split into two others. As a consequence, it may not be possible to trace software features from one release to another. When studying software evolution, we should be able to trace a class lifetime even when it disappears because it is replaced by a similar one, split or merged. Such a capability is also essential to perform impact analysis. This work proposes an automatic approach, inspired by vector space information retrieval, to identify class evolution discontinuities and, therefore, cases of possible refactoring. The approach has been applied to identify refactorings performed over 40 releases of a Java open source domain name server. Almost all the refactorings found were actually performed in the analyzed system, thus indicating the helpfulness of the approach and of the developed tool.
American Journal of Human Genetics | 2005
Pavel Hamet; Ettore Merlo; Ondrej Seda; Ulrich Broeckel; Johanne Tremblay; Mary L. Kaldunski; Daniel Gaudet; Gérard Bouchard; B. Deslauriers; F. Gagnon; Giuliano Antoniol; Zdenka Pausova; Malgorzata Labuda; Michèle Jomphe; Francis Gossard; Gérald Tremblay; R. Kirova; Peter J. Tonellato; Sergei N. Orlov; J. Pintos; J. Platko; Thomas J. Hudson; John D. Rioux; Theodore A. Kotchen; Allen W. Cowley
The Saguenay-Lac St-Jean population of Quebec is relatively isolated and has genealogical records dating to the 17th-century French founders. In 120 extended families with at least one sib pair affected with early-onset hypertension and/or dyslipidemia, we analyzed the genetic determinants of hypertension and related cardiovascular and metabolic conditions. Variance-components linkage analysis revealed 46 loci after 100,000 permutations. The most prominent clusters of overlapping quantitative-trait loci were on chromosomes 1 and 3, a finding supported by principal-components and bivariate analyses. These genetic determinants were further tested by classifying families by use of LOD score density analysis for each measured phenotype at every 5 cM. Our study showed the founder effect over several generations and classes of living individuals. This quantitative genealogical approach supports the notion of the ancestral causality of traits uniquely present and inherited in distinct family classes. With the founder effect, traits determined within population subsets are measurably and quantitatively transmitted through generational lineage, with a precise component contributing to phenotypic variance. These methods should accelerate the uncovering of causal haplotypes in complex diseases such as hypertension and metabolic syndrome.
international conference on software maintenance | 2001
Giuliano Antoniol; G. Casazza; M. Di Penta; Ettore Merlo
The actual effort to evolve and maintain a software system is likely to vary depending on the amount of clones (i.e., duplicated or slightly different code fragments) present in the system. This paper presents a method for monitoring and predicting clone evolution across subsequent versions of a software system. Clones are first identified using a metric-based approach, then they are modeled in terms of time series, identifying a predictive model. The proposed method has been validated with an experimental activity performed on 27 subsequent versions of mSQL, a medium-size software system written in C. The time span of the analyzed mSQL releases covers four years, from May 1995 (mSQL 1.0.6) to May 1999 (mSQL 2.0.10). For any given software release, the identified model was able to predict the clone percentage of the subsequent release with an average error below 4%. A higher prediction error was observed only in correspondence with major system redesign.
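The first step, metric-based clone identification, can be illustrated with a minimal sketch. Everything here is an assumption made for the example: the abstract does not say which metrics are used or how they are compared, so the sketch simply treats functions with identical metric tuples as clone candidates and reports the share of such functions in one release. The names `clone_groups` and `clone_percentage` are hypothetical.

```python
from collections import defaultdict


def clone_groups(functions):
    """Group function names whose metric tuples are identical.

    `functions` maps a function name to a tuple of metric values
    (hypothetical example metrics: lines of code, branches, calls).
    """
    buckets = defaultdict(list)
    for name, metrics in functions.items():
        buckets[tuple(metrics)].append(name)
    # Only buckets holding two or more functions are clone candidates.
    return [sorted(names) for names in buckets.values() if len(names) > 1]


def clone_percentage(functions):
    """Percentage of functions in a release that belong to some clone group."""
    if not functions:
        return 0.0
    cloned = sum(len(group) for group in clone_groups(functions))
    return 100.0 * cloned / len(functions)
```

Computing `clone_percentage` for each release in order yields the kind of series that a time series model could then be fitted to.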
working conference on reverse engineering | 1999
Magdalena Balazinska; Ettore Merlo; Michel Dagenais; Bruno Laguë; Kostas Kontogiannis
Code duplication, plausibly caused by copying source code and slightly modifying it, is often observed in large systems. Clone detection and documentation have been investigated by several researchers in past years. Recently, research focus has shifted towards the investigation of software and process restructuring actions based on clone detection. The paper presents a new redesign approach developed for Java software systems. The approach factorizes the common parts of cloned methods and parameterizes their differences using the strategy design pattern. The new entities created by such transformations are also decoupled from the original contexts of their use, thus facilitating reuse and increasing maintainability. The applicability and automation of the technique presented in the paper have been verified by partially redesigning JDK 1.1.5.
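The transformation described above can be illustrated on a toy case. The paper targets Java; Python is used here only for brevity, and all names (`Formatter`, `render_row`, and so on) are hypothetical. Suppose two cloned methods differ only in how a value is formatted: the common part is factorized into one function, and the difference is parameterized as a strategy object.

```python
from abc import ABC, abstractmethod


class Formatter(ABC):
    """Strategy interface capturing the one step that differed between clones."""

    @abstractmethod
    def format(self, value: float) -> str:
        ...


class PercentFormatter(Formatter):
    def format(self, value: float) -> str:
        return f"{value * 100:.1f}%"


class CurrencyFormatter(Formatter):
    def format(self, value: float) -> str:
        return f"${value:.2f}"


def render_row(label: str, value: float, formatter: Formatter) -> str:
    """Common part of the former clones; the varying step is delegated."""
    return f"{label.strip().title()}: {formatter.format(value)}"
```

Because `render_row` no longer depends on the classes that originally hosted the clones, it can be reused with any new `Formatter`, which reflects the decoupling benefit the abstract mentions.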