Tudor Gîrba | Researchain

Publication

Featured researches published by Tudor Gîrba.

Information & Software Technology | 2007

Semantic clustering: Identifying topics in source code

Adrian Kuhn; Stéphane Ducasse; Tudor Gîrba

Many of the existing approaches in Software Comprehension focus on program structure or external documentation. However, by analyzing formal information the informal semantics contained in the vocabulary of source code are overlooked. To understand software as a whole, we need to enrich software analysis with the developer knowledge hidden in the code naming. This paper proposes the use of information retrieval to exploit linguistic information found in source code, such as identifier names and comments. We introduce Semantic Clustering, a technique based on Latent Semantic Indexing and clustering to group source artifacts that use similar vocabulary. We call these groups semantic clusters and we interpret them as linguistic topics that reveal the intention of the code. We compare the topics to each other, identify links between them, provide automatically retrieved labels, and use a visualization to illustrate how they are distributed over the system. Our approach is language independent as it works at the level of identifier names. To validate our approach we applied it on several case studies, two of which we present in this paper. Note: Some of the visualizations presented make heavy use of colors. Please obtain a color copy of the article for better understanding.

International Workshop on Principles of Software Evolution | 2005

How developers drive software evolution

Tudor Gîrba; Adrian Kuhn; Mauricio Seeberger; Stéphane Ducasse

As systems evolve, their structure changes in ways not expected upfront. As time goes by, the knowledge of the developers becomes more and more critical for the process of understanding the system. That is, when we want to understand a certain issue of the system we ask the knowledgeable developers. Yet, in large systems, not every developer is knowledgeable in all the details of the system. Thus, we would want to know which developer is knowledgeable in the issue at hand. In this paper we make use of the mapping between the changes and the author identifiers (e.g., user names) provided by versioning repositories. We first define a measurement for the notion of code ownership. We use this measurement to define the ownership map visualization to understand when and how different developers interacted, and in which part of the system. We report the results we obtained on several large systems.
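The abstract does not spell out the ownership measurement, so the following is only a minimal sketch under a stated assumption: the owner of a file is the author who changed the most lines in it. The function name `file_owners` and the `(author, file, lines_changed)` log format are hypothetical, chosen for illustration, and are not taken from the paper.

```python
from collections import defaultdict


def file_owners(changes):
    """Map each file to the author with the most changed lines.

    `changes` is an iterable of (author, file, lines_changed) tuples,
    a hypothetical flattening of a versioning repository log.
    Assumption: ownership = largest share of changed lines.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for author, path, lines in changes:
        totals[path][author] += lines
    # Pick, per file, the author with the highest accumulated line count.
    return {
        path: max(by_author, key=by_author.get)
        for path, by_author in totals.items()
    }


log = [
    ("alice", "Parser.java", 120),
    ("bob", "Parser.java", 30),
    ("bob", "Lexer.java", 75),
    ("alice", "Lexer.java", 10),
]
owners = file_owners(log)
print(owners)
```

Computing the same mapping per time window, rather than over the whole log, would give the kind of per-period ownership data that a visualization over time could be built on.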

International Conference on Software Maintenance | 2004

Yesterday's Weather: guiding early reverse engineering efforts by summarizing the evolution of changes

Tudor Gîrba; Stéphane Ducasse; Michele Lanza

Knowing where to start reverse engineering a large software system, when no information other than the system's source code itself is available, is a daunting task. Having the history of the code (i.e., the versions) could be of help if this would not imply analyzing a huge amount of data. We present an approach for identifying candidate classes for reverse engineering and reengineering efforts. Our solution is based on summarizing the changes in the evolution of object-oriented software systems by defining history measurements. Our approach, named Yesterday's Weather, is an analysis based on the retrospective empirical observation that classes which changed the most in the recent past also suffer important changes in the near future. We apply this approach on two case studies and show how we can obtain an overview of the evolution of a system and pinpoint its classes that might change in the next versions.
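The observation behind Yesterday's Weather can be illustrated with a small check: do the classes that changed most in the recent past overlap with those that change most in the following versions? This is only a sketch of that idea, not the paper's history measurements. The per-class change counts, the cutoff `k`, and the function names are assumptions made for illustration.

```python
def top_changers(changes, k):
    """Return the k classes with the highest change counts (ties broken by name)."""
    ranked = sorted(changes, key=lambda cls: (-changes[cls], cls))
    return set(ranked[:k])


def yesterdays_weather_hit(past, future, k):
    """True if at least one top-k past changer is also a top-k future changer."""
    return bool(top_changers(past, k) & top_changers(future, k))


# Hypothetical change counts per class, before and after a chosen version.
past = {"Parser": 9, "Lexer": 7, "Ast": 1, "Util": 0}
future = {"Parser": 4, "Lexer": 0, "Ast": 5, "Util": 1}

hit = yesterdays_weather_hit(past, future, k=2)
print(hit)
```

Repeating the check at each version of a history and averaging the hits would indicate how often recent change was a useful predictor for that system.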

International Conference on Software Maintenance | 2006

Distribution Map

Stéphane Ducasse; Tudor Gîrba; Adrian Kuhn

Understanding large software systems is a challenging task, and to support it many approaches have been developed. Often, the results of these approaches categorize existing entities into new groups or associate them with mutually exclusive properties. In this paper we present the distribution map as a generic technique to visualize and analyze this type of result. Our technique is based on the notion of focus, which shows whether a property is well-encapsulated or cross-cutting, and the notion of spread, which shows whether the property is present in several parts of the system. We present a basic visualization and complement it with measurements that quantify focus and spread. To validate our technique we show evidence of applying it on the result sets of different analysis approaches. As a conclusion we propose that the distribution map technique should belong to any reverse engineering toolkit.

Conference on Software Maintenance and Reengineering | 2004

Using history information to improve design flaws detection

D. Rapu; Stéphane Ducasse; Tudor Gîrba; Radu Marinescu

As systems evolve and their structure decays, maintainers need accurate and automatic identification of the design problems. Current approaches for automatic detection of design problems are not accurate enough because they analyze only a single version of a system and consequently they miss essential information as design problems appear and evolve over time. Our approach is to use the historical information of the suspected flawed structure to increase the accuracy of the automatic problem detection. Our means is to define measurements which summarize how persistent the problem was and how much maintenance effort was spent on the suspected structure. We apply our approach on a large scale case study and show how it improves the accuracy of the detection of god classes and data classes, and additionally how it adds valuable semantical information about the evolution of flawed design structures.

Working Conference on Reverse Engineering | 2005

Enriching reverse engineering with semantic clustering

Adrian Kuhn; Stéphane Ducasse; Tudor Gîrba

Understanding a software system by just analyzing the structure of the system reveals only half of the picture, since the structure tells us only how the code is working but not what the code is about. What the code is about can be found in the semantics of the source code: names of identifiers, comments etc. In this paper, we analyze how these terms are spread over the source artifacts using latent semantic indexing, an information retrieval technique. We use the assumption that parts of the system that use similar terms are related. We cluster artifacts that use similar terms, and we reveal the most relevant terms for the computed clusters. Our approach works at the level of the source code which makes it language independent. Nevertheless, we correlated the semantics with structural information and we applied it at different levels of abstraction (e.g. classes, methods). We applied our approach on three large case studies and we report the results we obtained.
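The assumption that artifacts using similar terms are related can be sketched in a few lines. Note that this example deliberately leaves out the latent semantic indexing step described in the abstract and compares raw term-frequency vectors with cosine similarity instead. The identifier-splitting rules and the function names `terms` and `cosine` are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def terms(source):
    """Split identifiers in `source` into lowercase terms (camelCase, snake_case)."""
    parts = []
    for word in re.findall(r"[A-Za-z]+", source):
        # "parseXMLDocument" -> ["parse", "XML", "Document"]
        parts += re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word)
    return Counter(p.lower() for p in parts)


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical method signatures standing in for source artifacts.
a = terms("void parseToken(TokenStream token_stream)")
b = terms("Token nextToken(TokenStream in)")
c = terms("int drawCircle(Canvas canvas)")

print(cosine(a, b) > cosine(a, c))
```

In this sketch the two token-handling signatures share vocabulary and score higher than the unrelated drawing signature, which shares none. A clustering step over such pairwise similarities would then group the artifacts.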

Journal of Software Maintenance and Evolution: Research and Practice | 2006

Modeling History to Analyze Software Evolution

Tudor Gîrba; Stéphane Ducasse

The histories of software systems hold useful information when reasoning about the systems at hand or when reasoning about general laws of software evolution. Over the past 30 years more and more research has been spent on understanding software evolution. However, the approaches developed so far do not rely on an explicit meta-model, and thus, they make it difficult to reuse or compare their results. We argue that there is a need for an explicit meta-model for software evolution analysis. We present a survey of the evolution analyses and deduce a set of requirements that an evolution meta-model should have. We define Hismo, a meta-model in which history is modeled as an explicit entity. Hismo adds a time layer on top of structural information, and provides a common infrastructure for expressing and combining evolution analyses and structural analyses. We validate the usefulness of our meta-model by presenting how different analyses are expressed on it.

Conference on Software Maintenance and Reengineering | 2005

Characterizing the evolution of class hierarchies

Tudor Gîrba; Michele Lanza; Stéphane Ducasse

Analyzing historical information can show how a software system evolved into its current state, which parts of the system are stable and which have changed more. However, historical analysis implies processing a vast amount of information making the interpretation of the results difficult. To address this issue, we introduce the notion of the history of source code artifacts as a first class entity and define measurements, which summarize the evolution of such entities. We use these measurements to define rules by which to detect different characteristics of the evolution of class hierarchies. Furthermore, we discuss the results we obtained by visualizing them using a polymetric view. We apply our approach on two large open source case studies and classify their class hierarchies based on their history.

Software Visualization | 2006

Mondrian: An Agile Visualization Framework

Michael Meyer; Tudor Gîrba; Mircea Lungu

Data visualization is the process of representing data as pictures to support reasoning about the underlying data. For the interpretation to be as easy as possible, we need to be as close as possible to the original data. As most visualization tools have an internal meta-model, which is different from the one for the presented data, they usually need to duplicate the original data to conform to their meta-model. This leads to an increase in the resources needed, an increase that is not always justified. In this work we argue for the need of having an engine that is as close as possible to the data and we present our solution of moving the visualization tool to the data, instead of moving the data to the visualization tool. Our solution also emphasizes the necessity of reusing basic blocks to express complex visualizations and allowing the programmer to script the visualization using his preferred tools, rather than a third party format. As a validation of the expressiveness of our framework, we show how we express several already published visualizations and describe the pros and cons of the approach.

Science of Computer Programming | 2010

The Small Project Observatory: Visualizing software ecosystems

Mircea Lungu; Michele Lanza; Tudor Gîrba; Romain Robbes

Software evolution research has focused mostly on analyzing the evolution of single software systems. However, it is rarely the case that a project exists as standalone, independent of others. Rather, projects exist in parallel within larger contexts in companies, research groups or even the open-source communities. We call these contexts software ecosystems. In this paper, we present the Small Project Observatory, a prototype tool which aims to support the analysis of software ecosystems through interactive visualization and exploration. We present a case study of exploring an ecosystem using our tool, we describe the architecture of the tool, and we distill lessons learned during the tool-building experience.
