Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Mika V. Mäntylä is active.

Publication


Featured research published by Mika V. Mäntylä.


Information & Software Technology | 2015

Using metrics in Agile and Lean Software Development - A systematic literature review of industrial studies

Eetu Kupiainen; Mika V. Mäntylä; Juha Itkonen

Context: The software industry has widely adopted Agile software development methods. The Agile literature proposes a few key metrics, but little is known about the actual use of metrics in Agile teams. Objective: The objective of this paper is to increase knowledge of the reasons for and effects of using metrics in industrial Agile development. We focus on the metrics that Agile teams use themselves, rather than those applied from the outside by software engineering researchers. In addition, we analyse the influence of the metrics used. Method: This paper presents a systematic literature review (SLR) on the use of metrics in industrial Agile software development. We identified 774 papers, which we reduced to 30 primary studies through our paper selection process. Results: The results indicate that the reasons for and the effects of using metrics are focused on the following areas: sprint planning, progress tracking, software quality measurement, fixing software process problems, and motivating people. Additionally, we show that although Agile teams use many metrics suggested in the Agile literature, they also use many custom metrics. Finally, the most influential metrics in the primary studies are Velocity and Effort estimate. Conclusion: The use of metrics in Agile software development is similar to that in traditional software development: projects and sprints need to be planned and tracked, quality needs to be measured, and problems in the process need to be identified and fixed. Future work should focus on metrics that had high importance but low prevalence in our study, as they can offer the largest impact to the software industry.
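
The two most influential metrics named above, Velocity and Effort estimate, reduce to simple arithmetic over sprint data. The sketch below is a minimal illustration with hypothetical sprint records and field names; it is not tooling from the review.

```python
# Minimal sketch of the review's two most influential metrics: sprint Velocity
# and Effort-estimate accuracy. Sprint records and field names are hypothetical.
sprints = [
    {"name": "Sprint 1", "estimated_points": 30, "completed_points": 24},
    {"name": "Sprint 2", "estimated_points": 28, "completed_points": 27},
    {"name": "Sprint 3", "estimated_points": 32, "completed_points": 25},
]

# Velocity: story points completed per sprint, often averaged over recent sprints.
average_velocity = sum(s["completed_points"] for s in sprints) / len(sprints)

# Effort-estimate accuracy: completed vs. estimated points, per sprint.
for s in sprints:
    accuracy = s["completed_points"] / s["estimated_points"]
    print(f"{s['name']}: velocity={s['completed_points']}, estimate accuracy={accuracy:.0%}")

print(f"Average velocity over {len(sprints)} sprints: {average_velocity:.1f} points")
```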


Mining Software Repositories | 2016

Mining valence, arousal, and dominance: possibilities for detecting burnout and productivity?

Mika V. Mäntylä; Bram Adams; Giuseppe Destefanis; Daniel Graziotin; Marco Ortu

Similar to other industries, the software engineering domain is plagued by psychological diseases such as burnout, which lead developers to lose interest, exhibit lower activity and/or feel powerless. Prevention is essential for such diseases, which in turn requires early identification of symptoms. The emotional dimensions of Valence, Arousal and Dominance (VAD) can be used to derive a person's interest (attraction), level of activation and perceived level of control for a particular situation from textual communication, such as emails. As an initial step towards identifying symptoms of productivity loss in software engineering, this paper explores the VAD metrics and their properties on 700,000 Jira issue reports containing over 2,000,000 comments, since issue reports keep track of a developer's progress on addressing bugs or new features. Using a general-purpose lexicon of 14,000 English words with known VAD scores, our results show that issue reports of different types (e.g., Feature Request vs. Bug) show fair variation in Valence, while an increase in issue priority (e.g., from Minor to Critical) typically increases Arousal. Furthermore, we show that as an issue's resolution time increases, so does the arousal of the individual the issue is assigned to. Finally, the resolution of an issue increases valence, especially for the issue Reporter and for quickly addressed issues. The existence of such relations between VAD and issue report activities shows promise that text mining could in the future offer an alternative to work health assessment surveys.
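
As a rough illustration of the lexicon-based approach described above, the sketch below averages per-word VAD scores over a comment's words. The tiny lexicon and its scores are hypothetical stand-ins for the general-purpose 14,000-word lexicon used in the paper, and the plain averaging is an assumption rather than the paper's exact aggregation.

```python
# Hedged sketch of lexicon-based VAD scoring for issue-tracker text. The tiny
# lexicon below is hypothetical; the paper used a ~14,000-word general-purpose
# lexicon, and the plain averaging here is an assumed aggregation.
import re
from statistics import mean

VAD_LEXICON = {
    # word: (valence, arousal, dominance) on an assumed 1-9 scale
    "crash":   (2.0, 6.5, 3.0),
    "blocker": (2.5, 6.0, 3.5),
    "fixed":   (7.0, 4.0, 6.5),
    "thanks":  (8.0, 3.5, 6.0),
    "urgent":  (3.0, 7.5, 4.0),
}

def vad_score(text):
    """Average V, A, D over lexicon words found in the text; None if no word matches."""
    tokens = re.findall(r"[a-z]+", text.lower())
    hits = [VAD_LEXICON[t] for t in tokens if t in VAD_LEXICON]
    if not hits:
        return None
    return tuple(mean(dim) for dim in zip(*hits))

comment = "This urgent blocker made the browser crash, thanks for looking into it."
print(vad_score(comment))  # -> (3.875, 5.875, 4.125)
```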


Information & Software Technology | 2016

When and what to automate in software testing? A multi-vocal literature review

Vahid Garousi; Mika V. Mäntylä

Context: Many organizations see software test automation as a solution for decreasing testing costs and reducing cycle time in software development. However, establishing automated testing may fail if test automation is not applied at the right time, in the right context and with the appropriate approach. Objective: The decisions on when and what to automate are important, since wrong decisions can lead to disappointment and major misplaced expenditure of resources and effort. To support decision making on when and what to automate, researchers and practitioners have proposed various guidelines, heuristics and factors since the early days of test automation technologies. As the number of such sources has increased, it is important to systematically categorize the current state of the art and practice, and to provide a synthesized overview. Method: To achieve the above objective, we performed a Multivocal Literature Review (MLR) study on when and what to automate in software testing. An MLR is a form of Systematic Literature Review (SLR) that includes the grey literature (e.g., blog posts and white papers) in addition to the formally published literature (e.g., journal and conference papers). We searched the academic literature using Google Scholar and the grey literature using the regular Google search engine. Results: Our MLR and its results are based on 78 sources, 52 of which were grey literature and 26 formally published sources. We used qualitative analysis (coding) to classify the factors affecting the when- and what-to-automate questions into five groups: (1) Software Under Test (SUT)-related factors, (2) test-related factors, (3) test-tool-related factors, (4) human and organizational factors, and (5) cross-cutting and other factors. The most frequent individual factors were: need for regression testing (44 sources), economic factors (43), and maturity of the SUT (39). Conclusion: We show that current decision support in software test automation provides reasonable advice for industry, and as a practical outcome of this research we have summarized it as a checklist that can be used by practitioners. However, we recommend developing systematic, empirically validated decision-support approaches, as the existing advice is often unsystematic and based on weak empirical evidence.
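
The checklist mentioned in the conclusion can be thought of as a weighted scoring of the reported factors. The sketch below is one possible shape for such a scoring; the factor names echo the three most frequent factors in the review, but the weights, the yes/no answers and the 0.6 threshold are illustrative assumptions, not the paper's checklist.

```python
# Hedged sketch of a when/what-to-automate decision aid as weighted scoring.
# Factor names echo the review's most frequent factors; the weights and the
# 0.6 threshold are illustrative assumptions, not the paper's checklist.
CHECKLIST_WEIGHTS = {
    "needs_regression_testing": 0.44,  # need for regression testing
    "positive_roi_expected":    0.43,  # economic factors
    "sut_is_mature":            0.39,  # maturity of the SUT
}

def automation_score(answers):
    """answers: dict mapping factor name -> True/False for one candidate test."""
    total = sum(CHECKLIST_WEIGHTS.values())
    score = sum(w for factor, w in CHECKLIST_WEIGHTS.items() if answers.get(factor))
    return score / total

candidate = {
    "needs_regression_testing": True,
    "positive_roi_expected": True,
    "sut_is_mature": False,
}
score = automation_score(candidate)
print(f"score={score:.2f} ->", "automate" if score >= 0.6 else "keep manual for now")
```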


Computer Science Review | 2016

Citations, research topics and active countries in software engineering

Vahid Garousi; Mika V. Mäntylä

Context: An enormous number of papers (more than 70,000) have been published in the area of Software Engineering (SE) since its inception in 1968. To better characterize and understand this massive research literature, there is a need for comprehensive bibliometric assessments of this vibrant field. Objective: The objective of this study is to use automated citation and topic analysis to characterize the software engineering research literature over the years. While a few bibliometric studies have appeared in the field of SE, this article aims to be the most comprehensive bibliometric assessment of the field so far. Method: To achieve the above objective, we report a bibliometric study with data collected from the Scopus database, consisting of over 70,000 articles. For thematic analysis, we used topic modeling to automatically generate the most probable topic distributions given the data. Results: We found that the number of papers published per year has grown tremendously, and currently 6,000-7,000 papers are published every year. At the same time, nearly half of the papers are not cited at all. Using text mining of article titles, we found that the current hot research topics in software engineering are: (1) web services, (2) mobile and cloud computing, (3) industrial (case) studies, (4) source code and (5) test generation. Finally, we found that a small number of large countries produces the majority of the papers in SE, while small European countries are proportionally the most active in SE based on the number of papers. Conclusion: Due to the large volume of research in SE, we suggest using automated bibliometric analysis as we have done in this paper. By picking out the most cited papers, we can present the landmarks of SE, and with thematic analysis we can characterize the entire field. This can be useful for students and other newcomers to SE and for presenting our achievements to other disciplines. In particular, we see and report the value of such an analysis in situations where performing a full-scale SLR is not feasible due to restrictions on time or the lack of exact research questions.
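
The thematic analysis above rests on topic modeling of article titles. The sketch below shows the general technique on a handful of invented titles using scikit-learn's LatentDirichletAllocation; the abstract does not state which implementation the authors used, so this library choice is an assumption.

```python
# Minimal topic-modeling sketch over paper titles. The titles are invented and
# scikit-learn's LDA is an assumed stand-in for the tooling used in the study.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

titles = [
    "Testing web services in the cloud",
    "Automatic test generation for mobile applications",
    "An industrial case study of source code quality",
    "Cloud computing for mobile web services",
    "Source code metrics in an industrial case study",
]

vectorizer = CountVectorizer(stop_words="english")
doc_term = vectorizer.fit_transform(titles)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(doc_term)

# Show the top words of each discovered topic.
terms = vectorizer.get_feature_names_out()
for topic_idx, weights in enumerate(lda.components_):
    top_words = [terms[i] for i in weights.argsort()[::-1][:4]]
    print(f"Topic {topic_idx}: {', '.join(top_words)}")
```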


International Conference on Software Testing, Verification and Validation | 2015

Prioritizing Manual Test Cases in Traditional and Rapid Release Environments

Hadi Hemmati; Zhihan Fang; Mika V. Mäntylä

Test case prioritization is one of the most practically useful activities in testing, especially for large-scale systems. The goal is to rank the existing test cases so that they detect faults as soon as possible, and any partial execution of the test suite detects the maximum number of defects for the given budget. Test prioritization becomes even more important when test execution is time consuming, e.g., for manual system tests vs. automated unit tests. Most existing test case prioritization techniques are based on code coverage, which requires access to source code. However, manual testing is mainly done in a black-box manner (manual testers do not have access to the source code). Therefore, in this paper, we first examine the existing test case prioritization techniques and modify them to be applicable to manual black-box system testing. We specifically study a coverage-based, a diversity-based, and a risk-driven approach for test case prioritization. Our empirical study on four older releases of Mozilla Firefox shows that none of the techniques strongly dominates the others in all releases. However, when we study nine more recent releases of Firefox, where development has moved from a traditional to a more agile and rapid release environment, we see a very significant difference (on average a 65% effectiveness improvement) between the risk-driven approach and its alternatives. Our conclusion, based on one case study of 13 releases of an industrial system, is that test suites in rapid release environments can potentially be prioritized very effectively for execution based on their historical riskiness, whereas the same conclusion does not hold in traditional software development environments.
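
The risk-driven approach ranks test cases by their historical riskiness. The sketch below is one simple interpretation, ordering manual test cases by a smoothed past failure rate; the data, the class layout and the risk formula are illustrative, not the paper's implementation.

```python
# Hedged sketch of risk-driven test case prioritization: order test cases by a
# simple historical risk score. Data and formula are illustrative only.
from dataclasses import dataclass

@dataclass
class TestCase:
    name: str
    executions: int   # runs in past releases
    failures: int     # runs that detected a fault

    @property
    def risk(self) -> float:
        # Failure rate with a Laplace-style prior, so a never-run test gets a
        # neutral 0.5 instead of dropping to the bottom of the ranking.
        return (self.failures + 1) / (self.executions + 2)

suite = [
    TestCase("bookmark sync",   executions=12, failures=5),
    TestCase("tab restore",     executions=20, failures=2),
    TestCase("new download UI", executions=0,  failures=0),
]

for tc in sorted(suite, key=lambda t: t.risk, reverse=True):
    print(f"{tc.name}: risk={tc.risk:.2f}")
```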


Computer Science Review | 2018

The evolution of sentiment analysis—A review of research topics, venues, and top cited papers

Mika V. Mäntylä; Daniel Graziotin; Miikka Kuutila

Sentiment analysis is one of the fastest growing research areas in computer science, making it challenging to keep track of all the activities in the area. We present a computer-assisted literature review, in which we utilize both text mining and qualitative coding, and analyze 6,996 papers from Scopus. We find that the roots of sentiment analysis are in studies on public opinion analysis at the beginning of the 20th century and in the text subjectivity analysis performed by the computational linguistics community in the 1990s. However, the outbreak of computer-based sentiment analysis only occurred with the availability of subjective texts on the Web. Consequently, 99% of the papers have been published after 2004. Sentiment analysis papers are scattered across multiple publication venues, and the combined number of papers in the top-15 venues represents only ca. 30% of the papers in total. We present the top-20 cited papers from Google Scholar and Scopus and a taxonomy of research topics. In recent years, sentiment analysis has shifted from analyzing online product reviews to social media texts from Twitter and Facebook. Many topics beyond product reviews, such as stock markets, elections, disasters, medicine, software engineering and cyberbullying, extend the utilization of sentiment analysis.


Empirical Software Engineering and Measurement | 2015

Citation and Topic Analysis of the ESEM Papers

Päivi Raulamo-Jurvanen; Mika V. Mäntylä; Vahid Garousi

Context: The pool of papers published in ESEM. Objective: To utilize citation analysis and automated topic analysis to characterize the SE research literature over the years, focusing on the papers published in ESEM. Method: We collected data from the Scopus database consisting of 513 ESEM papers. For thematic analysis, we used topic modeling to automatically generate the most probable topic distributions given the data. Results: Nearly 42% of the papers have not been cited at all, but the effect seems to wear off as time passes. Using text mining of article titles and abstracts, we found that currently the most popular research topics in the ESEM community are: systematic reviews, testing, defects, cost estimation, and teamwork. Conclusions: While this study analyzes the paper pool of the ESEM symposium, the approach can easily be applied to any other subset of SE papers to conduct large-scale studies. Due to the large volume of research in SE, we suggest using automated bibliometric analysis as we have done in this paper.
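
The citation analysis above is, at its core, aggregation over per-paper citation counts. The sketch below computes the share of uncited papers per publication year from a few hypothetical records; the study itself used a Scopus export of 513 ESEM papers.

```python
# Hedged sketch of the citation analysis: share of uncited papers per year,
# computed from hypothetical (year, citation_count) records rather than the
# actual Scopus export used in the study.
from collections import defaultdict

papers = [(2007, 0), (2007, 12), (2008, 0), (2008, 3), (2008, 0), (2013, 0), (2013, 1)]

per_year = defaultdict(lambda: [0, 0])  # year -> [uncited, total]
for year, citations in papers:
    per_year[year][1] += 1
    if citations == 0:
        per_year[year][0] += 1

for year in sorted(per_year):
    uncited, total = per_year[year]
    print(f"{year}: {uncited}/{total} uncited ({uncited / total:.0%})")
```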


Information & Software Technology | 2017

A benchmark study on the effectiveness of search-based data selection and feature selection for cross project defect prediction

Seyedrebvar Hosseini; Burak Turhan; Mika V. Mäntylä

Context: Previous studies have shown that steered training data or dataset selection can lead to better performance for cross-project defect prediction (CPDP). On the other hand, feature selection and data quality are issues to consider in CPDP. Objective: We aim to utilize the Nearest Neighbor (NN)-Filter, embedded in a genetic algorithm, to produce validation sets for generating evolving training datasets to tackle CPDP while accounting for potential noise in defect labels. We also investigate the impact of using different feature sets. Method: We extend our proposed approach, Genetic Instance Selection (GIS), by incorporating feature selection in its setting. We use 41 releases of 11 multi-version projects to assess the performance of GIS in comparison with benchmark CPDP approaches (NN-filter and naive CPDP) and within-project approaches (cross-validation (CV) and previous releases (PR)). To assess the impact of feature sets, we use two sets of features, SCM+OO+LOC (all) and CK+LOC (ckloc), as well as iterative info-gain subsetting (IG) for feature selection. Results: The GIS variant with info-gain feature selection is significantly better than NN-Filter (all, ckloc, IG) in terms of F1 (p-values ≪ 0.001, Cohen's d = {0.621, 0.845, 0.762}) and G (p-values ≪ 0.001, Cohen's d = {0.899, 1.114, 1.056}), and than naive CPDP (all, ckloc, IG) in terms of F1 (p-values ≪ 0.001, Cohen's d = {0.743, 0.865, 0.789}) and G (p-values ≪ 0.001, Cohen's d = {1.027, 1.119, 1.050}). Overall, the performance of GIS is comparable to that of the within-project defect prediction (WPDP) benchmarks, i.e. CV and PR. In the multiple comparisons test, all variants of GIS belong to the top-ranking group of approaches. Conclusions: We conclude that datasets obtained from search-based approaches combined with feature selection techniques are a promising way to tackle CPDP. Especially, the performance comparison with the within-project scenario encourages further investigation of our approach. However, the performance of GIS is based on high recall at the expense of a loss in precision. Using different optimization goals, utilizing other validation datasets and other feature selection techniques are possible future directions to investigate.
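
The NN-Filter referenced above selects, for each instance of the target project, its nearest cross-project instances as candidate training data. The sketch below shows that general idea with scikit-learn's NearestNeighbors; the neighbor count, the random features and the library are assumptions, not the configuration used in the paper.

```python
# Hedged sketch of an NN-filter for CPDP: for every target-project instance,
# keep its k nearest cross-project instances as training data. k=10, the random
# features and scikit-learn itself are assumptions, not the paper's setup.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
cross_project = rng.normal(size=(200, 20))   # candidate instances from other projects
target_project = rng.normal(size=(50, 20))   # instances of the project to predict

nn = NearestNeighbors(n_neighbors=10).fit(cross_project)
_, neighbor_idx = nn.kneighbors(target_project)

# The union of all selected neighbors forms the filtered training set.
selected = np.unique(neighbor_idx.ravel())
training_set = cross_project[selected]
print(f"Filtered training set: {len(training_set)} of {len(cross_project)} candidates")
```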


Predictive Models in Software Engineering | 2016

Search Based Training Data Selection For Cross Project Defect Prediction

Seyedrebvar Hosseini; Burak Turhan; Mika V. Mäntylä

Context: Previous studies have shown that steered training data or dataset selection can lead to better performance for cross-project defect prediction (CPDP). On the other hand, data quality is an issue to consider in CPDP. Aim: We aim at utilising the Nearest Neighbor (NN)-Filter, embedded in a genetic algorithm, for generating evolving training datasets to tackle CPDP, while accounting for potential noise in defect labels. Method: We propose a new search-based training data (i.e., instance) selection approach for CPDP called GIS (Genetic Instance Selection) that looks for solutions optimizing a combined measure of F-Measure and G-Mean on a validation set generated by the NN-filter. The genetic operations consider the similarities in features and address possible noise in assigned defect labels. We use 13 datasets from the PROMISE repository to compare the performance of GIS with benchmark CPDP methods, namely the NN-filter and naive CPDP, as well as with within-project defect prediction (WPDP). Results: Our results show that GIS is significantly better than the NN-Filter in terms of F-Measure (p-value ≪ 0.001, Cohen's d = 0.697) and G-Mean (p-value ≪ 0.001, Cohen's d = 0.946). It also outperforms the naive CPDP approach in terms of F-Measure (p-value ≪ 0.001, Cohen's d = 0.753) and G-Mean (p-value ≪ 0.001, Cohen's d = 0.994). In addition, the performance of our approach is better than that of WPDP, again considering F-Measure (p-value ≪ 0.001, Cohen's d = 0.227) and G-Mean (p-value ≪ 0.001, Cohen's d = 0.595) values. Conclusions: We conclude that search-based instance selection is a promising way to tackle CPDP. Especially, the performance comparison with the within-project scenario encourages further investigation of our approach. However, the performance of GIS is based on high recall at the expense of low precision. Using different optimization goals, e.g. targeting high precision, would be a future direction to investigate.
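
The genetic search described above optimizes a combined measure of F-Measure and G-Mean on the validation set. The sketch below shows one possible fitness function of that shape; the Naive Bayes classifier, the G-Mean definition (geometric mean of the per-class recalls) and the equal-weight combination are assumptions for illustration, not GIS itself.

```python
# Hedged sketch of a GIS-style fitness function: train on the selected
# instances, then score F-Measure and G-Mean on the NN-filter validation set.
# The classifier, G-Mean definition and equal weighting are assumptions.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import f1_score, recall_score

def fitness(selection_mask, X_train, y_train, X_val, y_val):
    """selection_mask: boolean array marking which training instances to keep."""
    model = GaussianNB().fit(X_train[selection_mask], y_train[selection_mask])
    pred = model.predict(X_val)
    f_measure = f1_score(y_val, pred)
    recall_defective = recall_score(y_val, pred, pos_label=1)
    recall_clean = recall_score(y_val, pred, pos_label=0)
    g_mean = np.sqrt(recall_defective * recall_clean)
    return (f_measure + g_mean) / 2.0  # combined objective for the genetic search

# Tiny synthetic usage: one candidate "chromosome" over random data.
rng = np.random.default_rng(1)
X_train, y_train = rng.normal(size=(100, 5)), rng.integers(0, 2, size=100)
X_val, y_val = rng.normal(size=(40, 5)), rng.integers(0, 2, size=40)
mask = rng.random(100) > 0.3
print(f"fitness = {fitness(mask, X_train, y_train, X_val, y_val):.3f}")
```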


Evaluation and Assessment in Software Engineering | 2017

Industry-academia collaborations in software engineering: An empirical analysis of challenges, patterns and anti-patterns in research projects

Vahid Garousi; Michael Felderer; João M. Fernandes; Dietmar Pfahl; Mika V. Mäntylä

Research collaboration between industry and academia supports improvement and innovation in industry and helps to ensure industrial relevance in academic research. However, many researchers and practitioners believe that the level of joint industry-academia collaboration (IAC) in software engineering (SE) research is still relatively low, compared to the amount of activity in each of the two communities. The goal of the empirical study reported in this paper is to characterize, in an exploratory manner, the state of IAC with respect to a set of challenges, patterns and anti-patterns identified by a recent Systematic Literature Review. To address this goal, we gathered the opinions of researchers and practitioners about their experiences in IAC projects. Our dataset includes 47 opinion data points related to a large set of projects conducted in 10 different countries. We aim to contribute to the body of evidence in the area of IAC, for the benefit of researchers and practitioners in conducting future successful IAC projects in SE. As an output, the study presents a set of empirical findings and evidence-based recommendations to increase the success of IAC projects.

Collaboration


Dive into Mika V. Mäntylä's collaborations.

Top Co-Authors

Vahid Garousi

University of Luxembourg

Bram Adams

École Polytechnique de Montréal
