Gregorio Robles | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Gregorio Robles is active.

Explore More

Publication

Featured researches published by Gregorio Robles.

hawaii international conference on system sciences | 2008

On the Inequality of Contributions to Wikipedia

Felipe Ortega; Jesus M. Gonzalez-Barahona; Gregorio Robles

Wikipedia is one of the most successful examples of massive collaborative content development. However, many of the mechanisms and procedures that it uses are still unknown in detail. For instance, how equal (or unequal) are the contributions to it has been discussed in the last years, with no conclusive results. In this paper, we study exactly that aspect by using Lorenz curves and Gini coefficients, very well known instruments to economists. We analyze the trends in the inequality of distributions for the ten biggest language editions of Wikipedia, and their evolution over time. As a result, we have found large differences in the number of contributions by different authors (something also observed in free, open source software development), and a trend to stable patterns of inequality in the long run.

international workshop on principles of software evolution | 2005

Evolution and growth in large libre software projects

Gregorio Robles; Juan Jose Amor; Jesus M. Gonzalez-Barahona; Israel Herraiz

Software evolution research has recently focused on new development paradigms, studying whether laws found in more classic development environments also apply. Previous works have pointed out that at least some laws seem not to be valid for these new environments and even Lehman has labeled those (up to the moment few) cases as anomalies and has suggested that further research is needed to clarify this issue. In this line, we consider in this paper a large set of libre (free, open source) software systems featuring a large community of users and developers. In particular, we analyze a number of projects found in literature up to now, including the Linux kernel. For comparison, we include other libre software kernels from the BSD family, and for completeness we consider a wider range of libre software applications. In the case of Linux and the other operating system kernels we have studied growth patterns also at the subsystem level. We have observed in the studied sample that super-linearity occurs only exceptionally, that many of the systems follow a linear growth pattern and that smooth growth is not that common. These results differ from the ones found generally in classical software evolution studies. Other behaviors and patterns give also a hint that development in the libre software world could follow different laws than those known, at least in some cases.

open source systems | 2006

Contributor Turnover in Libre Software Projects

Gregorio Robles; Jesus M. Gonzalez-Barahona

A common problem that management faces in software companies is the high instability of their staff. In libre (free, open source) software projects, the permanence of developers is also an open issue, with the potential of causing problems amplified by the self-organizing nature that most of them exhibit. Hence, human resources in libre software projects are even more difficult to manage: developers are in most cases not bound by a contract and, in addition, there is not a real management structure concerned about this problem. This raises some interesting questions with respect to the composition of development teams in libre software projects, and how they evolve over time. There are projects lead by their original founders (some sort of “code gods”), while others are driven by several different developer groups over time (i.e. the project “regenerates” itself). In this paper, we propose a quantitative methodology, based on the analysis of the activity in the source code management repositories, to study how these processes (developers leaving, developers joining) affect libre software projects. The basis of it is the analysis of the composition of the core group, the group of developers most active in a project, for several time lapses. We will apply this methodology to several large, well-known libre software projects, and show how it can be used to characterize them. In addition, we will discuss the lessons that can be learned, and the validity of our proposal.

mining software repositories | 2005

Developer identification methods for integrated data from various sources

Gregorio Robles; Jesus M. Gonzalez-Barahona

Studying a software project by mining data from a single repository has been a very active research field in software engineering during the last years. However, few efforts have been devoted to perform studies by integrating data from various repositories, with different kinds of information, which would, for instance, track the different activities of developers. One of the main problems of these multi-repository studies is the different identities that developers use when they interact with different tools in different contexts. This makes them appear as different entities when data is mined from different repositories (and in some cases, even from a single one). In this paper we propose an approach, based on the application of heuristics, to identify the many identities of developers in such cases, and a data structure for allowing both the anonymized distribution of information, and the tracking of identities for verification purposes. The methodology will be presented in general, and applied to the GNOME project as a case example. Privacy issues and partial merging with new data sources will also be considered and discussed.

Empirical Software Engineering | 2012

On the reproducibility of empirical software engineering studies based on data retrieved from development repositories

Jesus M. Gonzalez-Barahona; Gregorio Robles

Among empirical software engineering studies, those based on data retrieved from development repositories (such as those of source code management, issue tracking or communication systems) are specially suitable for reproduction. However their reproducibility status can vary a lot, from easy to almost impossible to reproduce. This paper explores which elements can be considered to characterize the reproducibility of a study in this area, and how they can be analyzed to better understand the type of reproduction studies they enable or obstruct. One of the main results of this exploration is the need of a systematic approach to asses the reproducibility of a study, due to the complexity of the processes usually involved, and the many details to be taken into account. To address this need, a methodology for assessing the reproducibility of studies is also presented and discussed, as a tool to help to raise awareness about research reproducibility in this field. The application of the methodology in practice has shown how, even for papers aimed to be reproducible, a systematic analysis raises important aspects that render reproduction difficult or impossible. We also show how, by identifying elements and attributes related to reproducibility, it can be better understood which kind of reproduction can be done for a specific study, given the description of datasets, methodologies and parameters it uses.

mining software repositories | 2008

Towards a simplification of the bug report form in eclipse

Israel Herraiz; Daniel M. German; Jesus M. Gonzalez-Barahona; Gregorio Robles

We believe that the bug report form of Eclipse contains too many fields, and that for some fields, there are too many options. In this MSR challenge report, we focus in the case of the severity field. That field contains seven different levels of severity. Some of them seem very similar, and it is hard to distinguish among them. Users assign severity, and developers give priority to the reports depending on their severity. However, if users can not distinguish well among the various severity options, they will probably assign different priorities to bugs that require the same priority. We study the mean time to close bugs reported in Eclipse, and how the severity assigned by users affects this time. The results shows that classifying by time to close, there are less clusters of bugs than levels of severity. We therefore conclude that there is a need to make a simpler bug report form.

International Journal of Information Technology and Web Engineering | 2006

Applying Social Network Analysis Techniques to Community-Driven Libre Software Projects

Luis Lopez-Fernandez; Gregorio Robles; Jesus M. Gonzalez-Barahona; Israel Herraiz

Source code management repositories of large, long-lived libre (free, open source) software projects can be a source of valuable data about the organizational structure, evolution, and knowledge exchange in the corresponding development communities. Unfortunately, the sheer volume of the available information renders it almost unusable without applying methodologies which highlight the relevant information for a given aspect of the project. Such methodology is proposed in this article, based on well known concepts from the social networks analysis field, which can be used to study the relationships among developers and how they collaborate in different parts of a project. It is also applied to data mined from some well known projects (Apache, GNOME, and KDE), focusing on the characterization of their collaboration network architecture. These cases help to understand the potentials of the methodology and how it is applied, but also shows some relevant results which open new paths in the understanding of the informal organization of libre software development communities.

mining software repositories | 2010

Replicating MSR: A study of the potential replicability of papers published in the Mining Software Repositories proceedings

Gregorio Robles

This paper is the result of reviewing all papers published in the proceedings of the former International Workshop on Mining Software Repositories (MSR) (2004–2006) and now Working Conference on MSR (2007–2009). We have analyzed the papers that contained any experimental analysis of software projects for their potentiality of being replicated. In this regard, three main issues have been addressed: i) the public availability of the data used as case study, ii) the public availability of the processed dataset used by researchers and iii) the public availability of the tools and scripts. A total number of 171 papers have been analyzed from the six workshops/working conferences up to date. Results show that MSR authors use in general publicly available data sources, mainly from free software repositories, but that the amount of publicly available processed datasets is very low. Regarding tools and scripts, for a majority of papers we have not been able to find any tool, even for papers where the authors explicitly state that they have built one. Lessons learned from the experience of reviewing the whole MSR literature and some potential solutions to lower the barriers of replicability are finally presented and discussed.

International Journal of Open Source Software and Processes | 2009

Tools for the Study of the Usual Data Sources found in Libre Software Projects

Gregorio Robles; Jesus M. Gonzalez-Barahona; Daniel Izquierdo-Cortazar; Israel Herraiz

Due to the open nature of Free/Libre/Open Source software projects, researchers have gained access to a rich set of development-related information. Although this information is publicly available on the Internet, obtaining and analyzing it in a convenient way is not an easy task and many considerations have to be taken into account. In this paper we present the most important data sources that can be found in libre software projects and that are studied by the research community: source code, source code management systems, mailing lists and bug tracking systems. We will give advice for the problems that can be found when retrieving and preparing the data sources for a posterior analysis, as well as provide information about the tools that support these tasks.

mining software repositories | 2009

Evolution of the core team of developers in libre software projects

Gregorio Robles; Jesus M. Gonzalez-Barahona; Israel Herraiz

In many libre (free, open source) software projects, most of the development is performed by a relatively small number of persons, the “core team”. The stability and permanence of this group of most active developers is of great importance for the evolution and sustainability of the project. In this position paper we propose a quantitative methodology to study the evolution of core teams by analyzing information from source code management repositories. The most active developers in different periods are identified, and their activity is calculated over time, looking for core team evolution patterns.

Explore More