Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Akito Monden is active.

Publication


Featured research published by Akito Monden.


International Conference on Computational Science | 2015

Benchmarking Software Maintenance Based on Working Time

Masateru Tsunoda; Akito Monden; Ken-ichi Matsumoto; Sawako Ohiwa; Tomoki Oshino

Software maintenance is an important activity in the software lifecycle. Software maintenance does not mean only removing faults found after software release: software also needs extensions or modifications of its functions due to changes in the business environment, and software maintenance covers these as well. In this research, we try to establish a benchmark of work efficiency for software maintenance. To establish the benchmark, factors affecting work efficiency should be clarified using a dataset collected from various organizations (a cross-company dataset). We used a dataset that includes 134 data points collected by the Economic Research Association in 2012, and analyzed the factors that affect the work efficiency of software maintenance. We defined work efficiency as the number of modified modules divided by working time. The main contribution of our research is illustrating the factors affecting work efficiency, based on an analysis using a cross-company dataset and working time. We also showed work efficiency classified by each factor, which can be used to benchmark an organization. We empirically illustrated that the use of Java and restrictions on development tools affect work efficiency.
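The paper's efficiency measure is a simple ratio. A minimal sketch of it (illustrative only, not the authors' tooling):

```python
# The paper defines work efficiency as the number of modified modules
# divided by working time; this helper just makes that definition concrete.
def work_efficiency(modified_modules: int, working_hours: float) -> float:
    """Modules modified per hour of maintenance work."""
    if working_hours <= 0:
        raise ValueError("working time must be positive")
    return modified_modules / working_hours

# Example: a maintenance task that touched 12 modules over 30 hours.
print(work_efficiency(12, 30.0))  # 0.4 modules per hour
```

Values computed this way for comparable tasks across organizations are what the benchmark compares.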


Evaluation and Assessment in Software Engineering | 2015

Case consistency: a necessary data quality property for software engineering data sets

Passakorn Phannachitta; Akito Monden; Jacky Keung; Ken-ichi Matsumoto

Data quality is an essential aspect of any empirical study, because the validity of models and/or analysis results derived from empirical data is inherently influenced by its quality. In this empirical study, we focus on data consistency as a critical factor influencing the accuracy of prediction models in software engineering. We propose a software metric called Cases Inconsistency Level (CIL) for analyzing conflicts within software engineering data sets, leveraging probability statistics on project cases and counting the number of conflicting pairs. The results demonstrate that CIL can be used as a metric to distinguish consistent data sets from inconsistent ones, which is valuable for building robust prediction models. Beyond measuring the level of consistency, CIL is shown to be applicable for predicting whether an effort model built from a data set can achieve higher accuracy, an important indicator for empirical experiments in software engineering.


Empirical Software Engineering and Measurement | 2017

The significant effects of data sampling approaches on software defect prioritization and classification

Kwabena Ebo Bennin; Jacky Keung; Akito Monden; Passakorn Phannachitta; Solomon Mensah

Context: Recent studies have shown that the performance of defect prediction models can be affected when data sampling approaches are applied to imbalanced training data. However, the magnitude (degree and power) of the effect of these sampling methods on the classification and prioritization performance of defect prediction models is still unknown. Goal: To investigate the statistical and practical significance of using resampled data for constructing defect prediction models. Method: We examine the practical effects of six data sampling methods on the performance of five defect prediction models. The prediction performance of models trained on default datasets (no sampling method) is compared with that of models trained on resampled datasets (with sampling methods applied). To decide whether the performance changes are significant, robust statistical tests are performed and effect sizes computed. Twenty releases of ten open source projects extracted from the PROMISE repository are considered and evaluated using the AUC, pd, pf and G-mean performance measures. Results: There are statistically significant differences and practical effects on classification performance (pd, pf and G-mean) between models trained on resampled datasets and those trained on default datasets. However, sampling methods have no statistically or practically significant effect on defect prioritization performance (AUC), with small or no effect sizes obtained from the models trained on resampled datasets. Conclusions: Existing sampling methods can properly set the threshold between buggy and clean samples, but they cannot improve the prediction of defect-proneness itself. Sampling methods are highly recommended for defect classification purposes when all faulty modules are to be considered for testing.
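As a concrete example of one of the sampling methods the study compares, here is a minimal sketch of random over-sampling, which duplicates randomly chosen minority-class (defective) modules until the training set is balanced. Illustrative only; real experiments would use a library such as imbalanced-learn.

```python
# Random over-sampling: grow the minority class by duplicating randomly
# chosen minority samples until it matches the majority class in size.
import random

def random_oversample(majority, minority, seed=0):
    """Return a class-balanced training set (majority + resampled minority)."""
    rng = random.Random(seed)
    resampled = list(minority)
    while len(resampled) < len(majority):
        resampled.append(rng.choice(minority))
    return majority + resampled

clean = [("m%d" % i, 0) for i in range(8)]  # non-defective modules (majority)
buggy = [("b%d" % i, 1) for i in range(2)]  # defective modules (minority)
balanced = random_oversample(clean, buggy)
print(len(balanced))  # 16 samples, 8 per class
```

Under-sampling methods work in the opposite direction, discarding majority-class modules instead of duplicating minority ones.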


Empirical Software Engineering | 2017

A stability assessment of solution adaptation techniques for analogy-based software effort estimation

Passakorn Phannachitta; Jacky Keung; Akito Monden; Kenichi Matsumoto

Among the numerous possible choices of effort estimation methods, analogy-based software effort estimation based on case-based reasoning is one of the most widely adopted in both industry and the research community. Solution adaptation is the final step of analogy-based estimation, employed to aggregate and adapt the solutions derived during the case-based reasoning process. Variants of solution adaptation techniques have been proposed in previous studies; however, the ranking of these techniques is not conclusive, since different studies rank them in conflicting ways. This paper aims to find a stable ranking of solution adaptation techniques for analogy-based estimation. Compared with existing studies, we evaluate 8 commonly adopted solution adaptation techniques with more datasets (12), more feature selection techniques (4), and more stable error measures (5), applied with a robust statistical test method based on the Brunner test. This comprehensive experimental procedure allows us to discover a stable ranking of the techniques and to observe similar behaviors among techniques with similar adaptation mechanisms. In general, the linear adaptation techniques based on functions of size and productivity (e.g., the regression towards the mean technique) outperform the other techniques in the more robust experimental setting adopted in this study. Our empirical results show that project features with a strong correlation to effort, such as software size or productivity, should be utilized in the solution adaptation step to achieve desirable performance. Designing a solution adaptation strategy in analogy-based software effort estimation requires careful consideration of those influential features to ensure its predictions are relevant and accurate.
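A minimal sketch of the kind of size-based linear adaptation the paper finds effective (illustrative; the study compares 8 such techniques, and this is only one simple variant): each analogue's effort is scaled by the ratio of the target project's size to the analogue's size, and the adjusted values are averaged.

```python
# Size-adjusted solution adaptation for analogy-based effort estimation:
# scale each nearest-neighbour project's effort by relative size, then average.
def adapt_by_size(target_size, analogues):
    """analogues: list of (size, effort) pairs of the nearest past projects."""
    adjusted = [effort * (target_size / size) for size, effort in analogues]
    return sum(adjusted) / len(adjusted)

# Hypothetical target project of 120 function points, with two analogues.
print(adapt_by_size(120, [(100, 500.0), (150, 690.0)]))  # ~576.0 person-hours
```

Without the size adjustment (i.e., plainly averaging 500 and 690), the estimate would ignore exactly the strongly effort-correlated feature the paper says should drive adaptation.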


Computer Software and Applications Conference | 2016

Investigating the Effects of Balanced Training and Testing Datasets on Effort-Aware Fault Prediction Models

Kwabena Ebo Bennin; Jacky Keung; Akito Monden; Yasutaka Kamei; Naoyasu Ubayashi

To prioritize software quality assurance efforts, fault prediction models have been proposed to distinguish faulty modules from clean modules. The performance of such models is often biased due to the skewness or class imbalance of the datasets considered. To improve the prediction performance of these models, sampling techniques have been employed to rebalance the distribution of fault-prone and non-fault-prone modules. The effects of these techniques have been evaluated in terms of accuracy, geometric mean, and F1-measure in previous studies; however, these measures do not consider the effort needed to fix faults. To empirically investigate the effect of sampling techniques on the performance of software fault prediction models in a more realistic setting, this study employs Norm(Popt), an effort-aware measure that considers the testing effort. We performed two sets of experiments aimed at (1) assessing the effects of sampling techniques on effort-aware models and finding the appropriate class distribution for training datasets, and (2) investigating the role of balanced training and testing datasets in the performance of predictive models. Of the four sampling techniques applied, the over-sampling techniques outperformed the under-sampling techniques, with Random Over-sampling performing best with respect to the Norm(Popt) evaluation measure. Performance of all the prediction models also improved when sampling techniques were applied at rates of 20-30% on the training datasets, implying that a strictly balanced dataset (50% faulty modules and 50% clean modules) does not yield the best performance for effort-aware models. Our results further indicate that the performance of effort-aware models depends significantly on the proportions of the two classes in the testing dataset. Models trained on moderately balanced datasets are more likely to withstand fluctuations in performance as the class distribution in the testing data varies.


Frontiers in Education Conference | 2015

Programming education for primary school children using a textual programming language

Hidekuni Tsukamoto; Yasuhiro Takemura; Hideo Nagumo; Isamu Ikeda; Akito Monden; Ken-ichi Matsumoto

In this research, a Textual Programming Language (TPL) is used in programming education for primary schoolchildren for the following reasons: (1) it is more practical to use programming languages similar to the ones used for developing real applications, (2) typing statements may be easier for primary schoolchildren than generally thought, and (3) there exist programming environments such as Processing that are easy to use and produce very attractive graphical output. Teaching material for programming education with Processing was developed. In this teaching material, cartoons were used to explain difficult concepts, and the learners were asked to draw computational figures in colors of their choice. Trial experiments using this teaching material were conducted with a cohort of seven primary schoolchildren (six 4th-grade and one 5th-grade child) in two consecutive weekend classes (one hour each). Since the authors' aim was to create a sense of fun and excitement in the children and inculcate a desire to engage with computing, the children's motivation was assessed using a questionnaire based on the ARCS (Attention, Relevance, Confidence, and Satisfaction) motivation model. The results were encouraging and suggested that TPLs can be used in programming education for primary schoolchildren.


Annual ACIS International Conference on Computer and Information Science | 2016

A fuzzy hashing technique for large scale software birthmarks

Takehiro Tsuzaki; Teruaki Yamamoto; Haruaki Tamada; Akito Monden

Software birthmarks have been proposed as a method for detecting programs that may have been stolen, by measuring the similarity between two programs. A birthmark is created from each program by extracting its native characteristics, and the birthmarks of the programs can then be compared. However, because the extracted birthmarks contain a large amount of information, comparing large programs takes a large amount of time. This paper describes our work to reduce this comparison time. Achieving faster comparisons will enable the evaluation of large programs and simplify the use of birthmarks. Specifically, our method creates hashes from conventional birthmark information using fuzzy hashing, and then measures the similarity of the programs using the obtained hash values. Using the proposed method, we achieved a major speed increase over the conventional birthmark method, with distinction rates of over 90%. On the other hand, because preservation performance decreased substantially, the similarity threshold value needed to be lowered when using the proposed method.
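The core idea is to compare small digests instead of full birthmarks. The sketch below is an illustrative stand-in, not the authors' implementation: it digests a birthmark (here, an opcode sequence) into a set of hashed 3-grams and compares digests with Jaccard similarity, whereas real systems would use a fuzzy-hash scheme such as ssdeep.

```python
# Stand-in for fuzzy-hash comparison: compress a birthmark into a compact
# set of hashed opcode n-grams, then compare the digests, not the originals.
def ngram_digest(opcodes, n=3):
    """Hash every n-gram of the opcode sequence into a compact set."""
    return {hash(tuple(opcodes[i:i + n])) for i in range(len(opcodes) - n + 1)}

def similarity(d1, d2):
    """Jaccard similarity of two digests, in [0, 1]."""
    return len(d1 & d2) / len(d1 | d2)

# Two hypothetical programs differing in one instruction.
a = ngram_digest(["load", "add", "store", "load", "add", "ret"])
b = ngram_digest(["load", "add", "store", "load", "mul", "ret"])
print(similarity(a, b))  # 2 of 6 distinct n-grams shared -> 0.333...
```

The speed/accuracy trade-off reported in the paper follows directly from this design: digests are far smaller than full birthmarks, but the compression also loses information, which is why the similarity threshold had to be lowered.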


International Journal of Software Innovation (IJSI) | 2017

Scaling Up Software Birthmarks Using Fuzzy Hashing

Takehiro Tsuzaki; Teruaki Yamamoto; Haruaki Tamada; Akito Monden

To detect software theft, software birthmarks have been proposed. Software birthmark systems extract birthmarks, which are native characteristics of software, from binary programs, and compare them by computing the similarity between birthmarks. This paper proposes a new procedure for scaling up birthmark systems. While conventional birthmark systems are composed of a birthmark extraction phase and a birthmark comparison phase, the proposed method adds two new phases between extraction and comparison: a compression phase, which employs fuzzy hashing, and a pre-comparison phase, which aims to increase the distinction property of birthmarks. The proposed method reduces the time required in the comparison phase, so that it can be applied to detect software theft among many large-scale software products. From an experimental evaluation, we found that the proposed method significantly reduces the comparison time while keeping the distinction performance, which is one of the important properties of a birthmark. The preservation performance is also acceptable when the threshold value is properly set.


Annual ACIS International Conference on Computer and Information Science | 2016

Identifying recurring association rules in software defect prediction

Takashi Watanabe; Akito Monden; Yasutaka Kamei; Shuji Morisaki

Association rule mining discovers patterns of co-occurring attributes in a data set as association rules. The derived association rules are expected to be recurrent, that is, the patterns should recur in other data sets in the future. This paper defines the recurrence of a rule and aims to find a criterion for distinguishing highly recurrent rules from less recurrent ones using a data set for software defect prediction. An experiment with the Eclipse Mylyn defect data set showed that rules supported by fewer than 30 transactions exhibited low recurrence. We also found that the lower bound on transactions for selecting highly recurrent rules depends on the required precision of defect prediction.
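The support-count filter suggested by this finding can be sketched as follows (illustrative only; rules and transactions are modeled as attribute sets, and the attribute names are hypothetical):

```python
# Keep only rules backed by at least `min_support` transactions, following
# the paper's observation that rules under ~30 transactions recur poorly.
def support_count(rule, transactions):
    """Number of transactions containing both sides of the rule."""
    antecedent, consequent = rule
    items = antecedent | consequent
    return sum(1 for t in transactions if items <= t)

def keep_recurrent(rules, transactions, min_support=30):
    return [r for r in rules if support_count(r, transactions) >= min_support]

# Hypothetical defect data: 40 high-churn defective modules, 10 clean ones.
transactions = [{"high_churn", "defective"}] * 40 + [{"low_churn"}] * 10
rules = [({"high_churn"}, {"defective"}), ({"low_churn"}, {"defective"})]
print(len(keep_recurrent(rules, transactions)))  # 1 rule survives the filter
```

The paper's second finding means `min_support` is not universal: the right cutoff shifts with how much precision the downstream defect predictor requires.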


Proceedings of the 5th Program Protection and Reverse Engineering Workshop | 2015

Pinpointing and Hiding Surprising Fragments in an Obfuscated Program

Yuichiro Kanzaki; Clark D. Thomborson; Akito Monden; Christian S. Collberg

In this paper, we propose a pinpoint-hide defense method that aims to improve the stealth of obfuscated code. In the pinpointing process, we scan the obfuscated code at the level of small code fragments and identify all surprising fragments, that is, very unusual fragments that may draw an attacker's attention to the obfuscated code. In the hiding process, we transform the pinpointed surprising fragments into unsurprising ones while preserving semantics. Code transformed by our method consists only of unsurprising fragments and is therefore more difficult for attackers to distinguish from unobfuscated code than the original. In a case study, we apply our pinpoint-hide method to programs transformed by well-known obfuscation techniques. The results show that our method can pinpoint surprising fragments such as dummy code that does not fit the context of the program, and instructions used in complicated arithmetic expressions. We also confirm that instruction camouflage can make the pinpointed surprising fragments unsurprising, and that the camouflaged program runs correctly.
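One plausible way to flag "surprising" fragments (the abstract does not fix an implementation, so this is an assumed approach with hypothetical opcode names): score each instruction bigram by how rarely it occurs in a corpus of ordinary, unobfuscated code, and flag the low-probability ones.

```python
# Flag instruction bigrams that are rare in a corpus of normal code; such
# statistically unusual fragments are candidates for "surprising" code.
from collections import Counter

def bigram_model(corpus):
    """Relative frequency of each adjacent opcode pair in normal code."""
    pairs = Counter(zip(corpus, corpus[1:]))
    total = sum(pairs.values())
    return {p: c / total for p, c in pairs.items()}

def surprising_fragments(program, model, threshold=0.01):
    """Return bigrams whose frequency in normal code is below the threshold."""
    return [p for p in zip(program, program[1:]) if model.get(p, 0.0) < threshold]

normal = ["load", "add", "store"] * 50          # corpus of typical code
obfuscated = ["load", "xor", "rotl", "store"]   # fragment with unusual pairs
model = bigram_model(normal)
print(surprising_fragments(obfuscated, model))  # all three bigrams flagged
```

The hiding step would then rewrite each flagged fragment into a semantically equivalent sequence whose bigrams score as ordinary under the same model.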

Collaboration


Dive into Akito Monden's collaboration.

Top Co-Authors

Ken-ichi Matsumoto, Nara Institute of Science and Technology

Jacky Keung, City University of Hong Kong

Passakorn Phannachitta, Nara Institute of Science and Technology

Kwabena Ebo Bennin, City University of Hong Kong

Hideo Nagumo, Niigata Seiryo University

Solomon Mensah, City University of Hong Kong