Bogdan Vasilescu | Researchain

Publication

Featured research published by Bogdan Vasilescu.

conference on computer supported cooperative work | 2014

How social Q&A sites are changing knowledge sharing in open source software communities

Bogdan Vasilescu; Alexander Serebrenik; Premkumar T. Devanbu; Vladimir Filkov

Historically, mailing lists have been the preferred means for coordinating development and user support activities. With the emergence and popularity growth of social Q&A sites such as the StackExchange network (e.g., StackOverflow), this is beginning to change. Such sites offer different socio-technical incentives to their participants than mailing lists do, e.g., rich web environments to store and manage content collaboratively, or a place to showcase their knowledge and expertise more vividly to peers or potential recruiters. A key difference between StackExchange and mailing lists is gamification, i.e., StackExchange participants compete to obtain reputation points and badges. In this paper, we use a case study of R (a widely-used tool for data analysis) to investigate how mailing list participation has evolved since the launch of StackExchange. Our main contribution is the assembly of a joint data set from the two sources, in which participants in both the r-help mailing list and StackExchange are identifiable. This permits their activities to be linked across the two resources and also over time. With this data set we found that user support activities show a strong shift away from r-help. In particular, mailing list experts are migrating to StackExchange, where their behaviour is different. First, participants active both on r-help and on StackExchange are more active than those who focus exclusively on only one of the two. Second, they provide faster answers on StackExchange than on r-help, suggesting they are motivated by the gamified environment. To our knowledge, our study is the first to directly chart the changes in behaviour of specific contributors as they migrate into gamified environments, and has important implications for knowledge management in software engineering.

human factors in computing systems | 2015

Gender and Tenure Diversity in GitHub Teams

Bogdan Vasilescu; Daryl Posnett; Baishakhi Ray; Mark van den Brand; Alexander Serebrenik; Premkumar T. Devanbu; Vladimir Filkov

Software development is usually a collaborative venture. Open Source Software (OSS) projects are no exception; indeed, by design, the OSS approach can accommodate teams that are more open, geographically distributed, and dynamic than commercial teams. This, we find, leads to OSS teams that are quite diverse. Team diversity, predominantly in offline groups, is known to correlate with team output, mostly with positive effects. How about in OSS? Using GitHub, the largest publicly available collection of OSS projects, we studied how gender and tenure diversity relate to team productivity and turnover. Using regression modeling of GitHub data and the results of a survey, we show that both gender and tenure diversity are positive and significant predictors of productivity, together explaining a sizable fraction of the data variability. These results can inform decision making on all levels, leading to better outcomes in recruiting and performance.

foundations of software engineering | 2015

Quality and productivity outcomes relating to continuous integration in GitHub

Bogdan Vasilescu; Yue Yu; Huaimin Wang; Premkumar T. Devanbu; Vladimir Filkov

Software processes comprise many steps; coding is followed by building, integration testing, system testing, deployment, operations, among others. Software process integration and automation have been areas of key concern in software engineering, ever since the pioneering work of Osterweil; market pressures for Agility, and open, decentralized, software development have provided additional pressures for progress in this area. But do these innovations actually help projects? Given the numerous confounding factors that can influence project performance, it can be a challenge to discern the effects of process integration and automation. Software project ecosystems such as GitHub provide a new opportunity in this regard: one can readily find large numbers of projects in various stages of process integration and automation, and gather data on various influencing factors as well as productivity and quality outcomes. In this paper we use large, historical data on process metrics and outcomes in GitHub projects to discern the effects of one specific innovation in process automation: continuous integration. Our main finding is that continuous integration improves the productivity of project teams, who can integrate more outside contributions, without an observable diminishment in code quality.

Empirical Software Engineering | 2014

On the variation and specialisation of workload--A case study of the Gnome ecosystem community

Bogdan Vasilescu; Alexander Serebrenik; Mathieu Goeminne; Tom Mens

Most empirical studies of open source software repositories focus on the analysis of isolated projects, or restrict themselves to the study of the relationships between technical artifacts. In contrast, we have carried out a case study that focuses on the actual contributors to software ecosystems, being collections of software projects that are maintained by the same community. To this aim, we defined a new series of workload and involvement metrics, as well as a novel approach, $\widetilde{\mathbf{T}}$-graphs, for reporting the results of comparing multiple distributions. We used these techniques to statistically study how workload and involvement of ecosystem contributors vary across projects and across activity types, and we explored to what extent projects and contributors specialise in particular activity types. Using Gnome as a case study we observed that, next to coding, the activities of localization, development documentation and building are prevalent throughout the ecosystem. We also observed notable differences between frequent and occasional contributors in terms of the activity types they are involved in and the number of projects they contribute to. Occasional contributors and contributors that are involved in many different projects tend to be more involved in the localization activity, while frequent contributors tend to be more involved in the coding activity in a limited number of projects.

Journal of Software: Evolution and Process | 2013

Software quality metrics aggregation in industry

Karine Mordal; Nicolas Anquetil; Jannik Laval; Alexander Serebrenik; Bogdan Vasilescu; Stéphane Ducasse

With the growing need for quality assessment of entire software systems in the industry, new issues are emerging. First, because most software quality metrics are defined at the level of individual software components, there is a need for aggregation methods to summarize the results at the system level. Second, because a software evaluation requires the use of different metrics, with possibly widely varying output ranges, there is a need to combine these results into a unified quality assessment. In this paper we derive, from our experience on real industrial cases and from the scientific literature, requirements for an aggregation method. We then present a solution through the Squale model for metric aggregation, a model specifically designed to address the needs of practitioners. We empirically validate the adequacy of Squale through experiments on Eclipse. Additionally, we compare the Squale model to both traditional aggregation techniques (e.g., the arithmetic mean), and to econometric inequality indices (e.g., the Gini or the Theil indices), recently applied to aggregation of software metrics.

mining software repositories | 2014

Security and emotion: sentiment analysis of security discussions on GitHub

Daniel Pletea; Bogdan Vasilescu; Alexander Serebrenik

Application security is becoming increasingly prevalent during software and especially web application development. Consequently, countermeasures are continuously being discussed and built into applications, with the goal of reducing the risk that unauthorized code will be able to access, steal, modify, or delete sensitive data. In this paper we gauged the presence and atmosphere surrounding security-related discussions on GitHub, as mined from discussions around commits and pull requests. First, we found that security-related discussions account for approximately 10% of all discussions on GitHub. Second, we found that more negative emotions are expressed in security-related discussions than in other discussions. These findings confirm the importance of properly training developers to address security concerns in their applications as well as the need to test applications thoroughly for security vulnerabilities in order to reduce frustration and improve overall project atmosphere.

international conference on software maintenance | 2011

You can't control the unfamiliar: A study on the relations between aggregation techniques for software metrics

Bogdan Vasilescu; Alexander Serebrenik; Mark van den Brand

A popular approach to assessing software maintainability and predicting its evolution involves collecting and analyzing software metrics. However, metrics are usually defined on a micro-level (method, class, package), and should therefore be aggregated in order to provide insights into the evolution at the macro-level (system). In addition to traditional aggregation techniques such as the mean, median, or sum, recently econometric aggregation techniques, such as the Gini, Theil, Kolm, Atkinson, and Hoover inequality indices have been proposed and applied to software metrics. In this paper we present the results of an extensive correlation study of the most widely-used traditional and econometric aggregation techniques, applied to lifting SLOC values from class to package level in the 106 systems comprising the Qualitas Corpus. Moreover, we investigate the nature of this relation, and study its evolution on a subset of 12 systems from the Qualitas Corpus. Our results indicate high and statistically significant correlation between the Gini, Theil, Atkinson, and Hoover indices, i.e., aggregation values obtained using these techniques convey the same information. However, we discuss some of the rationale behind choosing between one index or another.

international conference on software maintenance | 2012

Who's who in Gnome: Using LSA to merge software repository identities

Erik Kouters; Bogdan Vasilescu; Alexander Serebrenik; Mark van den Brand

Understanding an individual's contribution to an ecosystem often necessitates integrating information from multiple repositories corresponding to different projects within the ecosystem or different kinds of repositories (e.g., mail archives and version control systems). However, recognising that different contributions belong to the same contributor is challenging, since developers may use different aliases. It is known that existing identity merging algorithms are sensitive to large discrepancies between the aliases used by the same individual: the noisier the data, the worse their performance. To assess the scale of the problem for a large software ecosystem, we study all Gnome Git repositories, classify the differences in aliases, and discuss robustness of existing algorithms with respect to these types of differences. We then propose a new identity merging algorithm based on Latent Semantic Analysis (LSA), designed to be robust against more types of differences in aliases, and evaluate it empirically by means of cross-validation on Gnome Git authors. Our results show a clear improvement over existing algorithms in terms of precision and recall on worst-case input data.

mining software repositories | 2015

Wait for it: determinants of pull request evaluation latency on GitHub

Yue Yu; Huaimin Wang; Vladimir Filkov; Premkumar T. Devanbu; Bogdan Vasilescu

workshop on emerging trends in software metrics | 2011

By no means: a study on aggregating software metrics

Bogdan Vasilescu; Alexander Serebrenik; Mark van den Brand
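
As a rough illustration of the econometric aggregation techniques named in the software metrics abstracts above, the sketch below lifts class-level SLOC values to a single package-level number using the Gini and Theil indices. It is not taken from any of the listed papers: the function names `gini` and `theil` and the sample SLOC values are assumptions made for this example, and the formulas are the standard textbook definitions.

```python
import math
from typing import Sequence


def gini(values: Sequence[float]) -> float:
    """Gini index of non-negative values.

    0 means all values are equal; (n - 1) / n is reached when one
    element holds the entire total.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n,
    # with ranks i = 1..n taken over the ascending values.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def theil(values: Sequence[float]) -> float:
    """Theil T index: mean of (x / mu) * ln(x / mu), treating 0 * ln(0) as 0.

    0 means all values are equal; ln(n) is reached when one element
    holds the entire total.
    """
    n = len(values)
    if n == 0:
        return 0.0
    mu = sum(values) / n
    if mu == 0:
        return 0.0
    return sum((x / mu) * math.log(x / mu) for x in values if x > 0) / n


# Hypothetical SLOC counts for the classes of one package (illustrative only).
sloc_per_class = [120, 80, 45, 300, 15]

print("mean :", sum(sloc_per_class) / len(sloc_per_class))
print("Gini :", gini(sloc_per_class))
print("Theil:", theil(sloc_per_class))
```

Unlike the mean, both indices describe how unevenly the code size is spread over the classes of a package rather than its typical magnitude, which is the kind of difference between traditional and econometric aggregation that the abstracts refer to.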
