Maëlick Claes | Researchain

Publication

Featured research published by Maëlick Claes.

mining software repositories | 2017

Abnormal working hours: effect of rapid releases and implications to work content

Maëlick Claes; Mika V. Mäntylä; Miikka Kuutila; Bram Adams

During the past years, overload at work leading to psychological diseases, such as burnout, has drawn more public attention. This paper is a preliminary step toward an analysis of the work patterns and possible indicators of overload and time pressure on software developers using a mining software repositories approach. We explore the working patterns of developers in the context of Mozilla Firefox, a large and long-lived open source project. To that end, we investigate the impact of the move from a traditional to a rapid release cycle on work patterns. Moreover, we compare the Mozilla Firefox work pattern with that of another Mozilla product, Firefox OS, which has a different release cycle than Firefox. We find that both projects exhibit healthy working patterns, i.e. lower activity during the weekends and outside of office hours. Firefox experiences proportionally more activity on weekends than Firefox OS (Cohen's d = 0.94). We find that switching to rapid releases has reduced weekend work (Cohen's d = 1.43) and working during the night (Cohen's d = 0.45). This result holds even when we limit the analyses to the hired resources, i.e. considering only individuals with a Mozilla Foundation email address, although the effect sizes are smaller for weekends (Cohen's d = 0.64) and nights (Cohen's d = 0.23). Moreover, we use dissimilarity word clouds and find that work during the weekend is more technical, while work during the week expresses more positive sentiment with words like "good" and "nice". Our results suggest that moving to rapid releases has a positive impact on the work health and work-life balance of software engineers. However, caution is needed as our results are based on a limited set of quantitative data from a single organization.
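
The effect sizes quoted in this abstract are Cohen's d values. As a minimal sketch of how such a value is computed, assuming the common pooled-standard-deviation form (the abstract does not say which variant the authors used) and purely hypothetical input numbers:

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation (assumed variant)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


# Hypothetical weekly shares of weekend commits; NOT data from the paper.
before_rapid = [0.12, 0.15, 0.11, 0.14, 0.13]
after_rapid = [0.09, 0.10, 0.08, 0.11, 0.10]
d = cohens_d(before_rapid, after_rapid)  # positive: more weekend work before
```

A positive d here means the first group has the larger mean; by the usual rule of thumb, values around 0.2, 0.5 and 0.8 are read as small, medium and large effects.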

international conference on software engineering | 2018

Do programmers work at night or during the weekend?

Maëlick Claes; Mika V. Mäntylä; Miikka Kuutila; Bram Adams

Abnormal working hours can reduce work health, general well-being, and productivity, independent of profession. To inform future approaches for automatic stress and overload detection, this paper establishes empirically collected measures of the work patterns of software engineers. To this end, we perform the first large-scale study of software engineers' working hours by investigating the time stamps of commit activities of 86 large open source software projects, containing both hired and volunteer developers. We find that two thirds of software engineers mainly follow typical office hours, empirically established to be from 10h to 18h, and do not usually work during nights and weekends. Large variations between projects and individuals exist. Surprisingly, we found no support that project maturation would decrease abnormal working hours. In the Firefox case study, we found that hired developers work more during office hours, while seniority, either in terms of number of commits or job status, did not impact working hours. We conclude that the use of working hours or timestamps of work products for stress detection requires establishing baselines at the level of individuals.
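
To illustrate the kind of measure described in this abstract, the sketch below buckets commit timestamps into office hours, off hours and weekends. It is an assumption-laden illustration, not the authors' pipeline: the 10h to 18h window comes from the abstract, but the half-open boundary, the use of the committer's local time, and all function names are hypothetical.

```python
from datetime import datetime

OFFICE_START, OFFICE_END = 10, 18  # office hours reported in the abstract


def classify(ts: datetime) -> str:
    """Bucket one local-time commit timestamp (assumed half-open 10:00-18:00)."""
    if ts.weekday() >= 5:  # Saturday = 5, Sunday = 6
        return "weekend"
    if OFFICE_START <= ts.hour < OFFICE_END:
        return "office"
    return "off_hours"


def shares(timestamps):
    """Return the fraction of commits falling into each bucket."""
    counts = {"office": 0, "off_hours": 0, "weekend": 0}
    for ts in timestamps:
        counts[classify(ts)] += 1
    total = len(timestamps)
    return {k: v / total for k, v in counts.items()} if total else counts
```

Computing these shares per developer rather than per project matches the abstract's conclusion that baselines need to be established at the level of individuals.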

Proceedings of the 3rd International Workshop on Emotion Awareness in Software Engineering | 2018

Daily questionnaire to assess self-reported well-being during a software development project

Miikka Kuutila; Mika V. Mäntylä; Maëlick Claes; Marko Elovainio

To the best of the authors' knowledge, this workshop paper makes two novel extensions to software engineering research. First, we create and execute a daily questionnaire monitoring the work well-being of software developers over a period of eight months. Second, we utilize statistical methods developed for discovering psychological dynamics to analyze this data. Our questionnaire includes elements from job satisfaction surveys and one software-development-specific element. The data were collected every day for a period of eight months in a single software development project, producing 526 answers from eight developers. The preliminary analysis shows the strongest correlations between hurry and interruptions. Additionally, we constructed temporal and contemporaneous network models used for discovering psychological dynamics from the questionnaire responses. In the future, we will try to establish links between the survey responses and the measures collected by conducting software repository mining and sentiment analysis.

mining software repositories | 2018

Natural language or not (NLON): a package for software engineering text analysis pipeline

Mika V. Mäntylä; Fabio Calefato; Maëlick Claes

The use of natural language processing (NLP) is gaining popularity in software engineering. In order to correctly perform NLP, we must pre-process the textual information to separate natural language from other information, such as log messages, that are often part of the communication in software engineering. We present a simple approach for classifying whether some textual input is natural language or not. Although our NLoN package relies on only 11 language features and character tri-grams, we are able to achieve area under the ROC curve values between 0.976 and 0.987 on three different data sources, with Lasso regression from Glmnet as our learner and two human raters providing ground truth. Cross-source prediction performance is lower and fluctuates more, with top ROC performances from 0.913 to 0.980. Compared with prior work, our approach offers similar performance but is considerably more lightweight, making it easier to apply in software engineering text mining pipelines. Our source code and data are provided as an R package for further improvements.

mining software repositories | 2018

Towards automatically identifying paid open source developers

Maëlick Claes; Mika V. Mäntylä; Miikka Kuutila; Umar Farooq

Open source development contains contributions from both hired and volunteer software developers. Identification of this status is important when we consider the transferability of research results to the closed source software industry, which includes no volunteer developers. While many studies have taken the employment status of developers into account, this information is often gathered manually due to the lack of accurate automatic methods. In this paper, we present an initial step towards predicting paid and unpaid open source development using machine learning and compare our results with automatic techniques used in prior work. By relying on source code repository metadata from Mozilla, and manually collected employment status, we built a dataset of the most active developers, both volunteer and hired by Mozilla. We define a set of metrics based on developers' usual commit time patterns and use different classification methods (logistic regression, classification tree, and random forest). The results show that our proposed method identifies paid and unpaid commits with an AUC of 0.75 using random forest, which is higher than the AUC of 0.64 obtained with the best of the previously used automatic methods.

empirical software engineering and measurement | 2018

On the use of emoticons in open source software development

Maëlick Claes; Mika V. Mäntylä; Umar Farooq

Background: Using sentiment analysis to study software developers' behavior comes with challenges such as the presence of a large amount of technical discussion unlikely to express any positive or negative sentiment. However, emoticons provide information about developer sentiments that can easily be extracted from software repositories. Aim: We investigate how software developers use emoticons differently in issue trackers in order to better understand the differences between developers and determine to what extent emoticons can be used in place of sentiment analysis. Method: We extract emoticons from 1.3M comments from Apache's issue tracker and 4.5M from Mozilla's issue tracker using regular expressions built from a list of emoticons used by SentiStrength and Wikipedia. We check for statistical differences using Mann-Whitney U tests and determine the effect size with Cliff's δ. Results: Overall, Mozilla developers rely more on emoticons than Apache developers. While the overall rate of comments with emoticons is 1% and 3% for Apache and Mozilla, respectively, some individual developers can have a rate of up to 21%. Looking specifically at Mozilla developers, we find that western developers use significantly more emoticons (with medium effect size) than eastern developers. While the majority of emoticons are used to express joy, we find that Mozilla developers use emoticons more frequently to express sadness and surprise than Apache developers. Finally, we find that Apache developers use more emoticons overall during weekends than during weekdays, with the share of sad and surprised emoticons increasing during weekends. Conclusions: While emoticons are primarily used to express joy, the more occasional use of sad and surprised emoticons can potentially be utilized to detect frustration in place of sentiment analysis among developers using emoticons frequently enough.
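
The method in this abstract combines regex-based emoticon extraction with Cliff's δ as an effect size. The sketch below shows both steps under stated assumptions: the short `EMOTICONS` list is a hypothetical stand-in for the SentiStrength and Wikipedia lists, and the helper names are illustrative rather than taken from the paper.

```python
import re

# Hypothetical, tiny emoticon list; the paper uses much larger lists.
EMOTICONS = [":)", ":-)", ":(", ":-(", ":D", ";)", ":o"]

# Escape each emoticon and try longer ones first so ":-)" wins over ":-".
PATTERN = re.compile(
    "|".join(re.escape(e) for e in sorted(EMOTICONS, key=len, reverse=True))
)


def has_emoticon(comment: str) -> bool:
    """True if the comment contains at least one listed emoticon."""
    return PATTERN.search(comment) is not None


def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all cross-group pairs."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))
```

Cliff's δ ranges from -1 to 1 and makes no normality assumption, which is why it pairs naturally with the Mann-Whitney U test named in the abstract. A naive pattern like this one can also match emoticon-like character sequences inside code snippets, so real pipelines usually need extra filtering.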

empirical software engineering and measurement | 2018

Measuring LDA topic stability from clusters of replicated runs

Mika V. Mäntylä; Maëlick Claes; Umar Farooq

Background: Unstructured and textual data is increasing rapidly, and Latent Dirichlet Allocation (LDA) topic modeling is a popular data analysis method for it. Past work suggests that instability of LDA topics may lead to systematic errors. Aim: We propose a method that relies on replicated LDA runs, clustering, and providing a stability metric for the topics. Method: We generate k LDA topics and replicate this process n times, resulting in n*k topics. Then we use K-medoids to cluster the n*k topics into k clusters. The k clusters now represent the original LDA topics, and we present them like normal LDA topics showing the ten most probable words. For the clusters, we try multiple stability metrics, out of which we recommend Rank-Biased Overlap, showing the stability of the topics inside the clusters. Results: We provide an initial validation where our method is used for 270,000 Mozilla Firefox commit messages with k=20 and n=20. We show how our topic stability metrics are related to the contents of the topics. Conclusions: Advances in text mining enable us to analyze large masses of text in software engineering, but non-deterministic algorithms, such as LDA, may lead to unreplicable conclusions. Our approach makes LDA stability transparent and is also complementary rather than alternative to many prior works that focus on LDA parameter tuning.
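
Rank-Biased Overlap, the stability metric recommended in this abstract, compares two ranked lists while weighting the top ranks most heavily. The following is a minimal sketch of a truncated RBO over two top-word lists; it omits the extrapolation term of the full metric, and the persistence value `p = 0.9` and the function name are assumptions, not settings reported in the paper.

```python
def rbo_truncated(a, b, p=0.9):
    """Truncated rank-biased overlap of two ranked lists without duplicates.

    Sums p**(d-1) * |a[:d] & b[:d]| / d over depths d, scaled by (1 - p).
    Identical lists of length k score 1 - p**k, not 1, due to truncation.
    """
    depth = min(len(a), len(b))
    seen_a, seen_b = set(), set()
    score = 0.0
    for d in range(1, depth + 1):
        seen_a.add(a[d - 1])
        seen_b.add(b[d - 1])
        score += p ** (d - 1) * len(seen_a & seen_b) / d
    return (1 - p) * score
```

Applied to the ten most probable words of two topics in the same cluster, a higher score indicates that the replicated runs agree on both the words and their order.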

empirical software engineering and measurement | 2018

Using experience sampling to link software repositories with emotions and work well-being

Miikka Kuutila; Mika V. Mäntylä; Maëlick Claes; Marko Elovainio; Bram Adams

Background: The experience sampling method studies everyday experiences of humans in natural environments. In psychology it has been used to study the relationships between work well-being and productivity. To the best of our knowledge, daily experience sampling has not been previously used in software engineering. Aims: Our aim is to identify links between software developers' self-reported affective states and work well-being and measures obtained from software repositories. Method: We perform an experience sampling study in a software company for a period of eight months. We use logistic regression to link the well-being measures with development activities, i.e. number of commits and chat messages. Results: We find several significant relationships between questionnaire variables and software repository variables. To our surprise, the relationship between hurry and number of commits is negative, meaning more perceived hurry is linked with a smaller number of commits. We also find a negative relationship between social interaction and hindered work well-being. Conclusions: The negative link between commits and hurry is counter-intuitive and goes against previous lab experiments in software engineering that show increased efficiency under time pressure. Overall, our study is an initial step in using experience sampling in software engineering and validating theories on work well-being from other fields in the domain of software engineering.

2017 IEEE/ACM 2nd International Workshop on Emotion Awareness in Software Engineering (SEmotion) | 2017