Keisuke Hotta | Researchain


Publication

Featured research published by Keisuke Hotta.

international workshop on principles of software evolution | 2010

Is duplicate code more frequently modified than non-duplicate code in software evolution?: an empirical study on open source software

Keisuke Hotta; Yukiko Sano; Yoshiki Higo; Shinji Kusumoto

Various kinds of research efforts have been performed on the basis that the presence of duplicate code has a negative impact on software evolution. A typical example is that, if we modify a code fragment that has been duplicated to other code fragments, it is necessary to consider whether the other code fragments have to be modified simultaneously or not. In this research, in order to investigate how much the presence of duplicate code is related to software evolution, we defined a new indicator, modification frequency. The indicator is a quantitative measure, and it allows us to objectively compare the maintainability of duplicate code and non-duplicate code. We conducted an experiment on 15 open source software systems, and the result showed that the presence of duplicate code does not have a negative impact on software evolution.

conference on software maintenance and reengineering | 2012

Identifying, Tailoring, and Suggesting Form Template Method Refactoring Opportunities with Program Dependence Graph

Keisuke Hotta; Yoshiki Higo; Shinji Kusumoto

Many research efforts have been performed on removing code clones. Especially, it is highly expected that clone removal techniques by applying Form Template Method have high applicability because they can be applied to code clones that have some gaps. Consequently some researchers have proposed techniques to support refactoring with Form Template Method. However, previous research efforts still have some issues. In this paper, we propose a new technique with program dependence graph to resolve these issues. By using program dependence graph, we can handle trivial differences that are unrelated to behavior of a program. Consequently the proposed method can suggest more appropriate removal candidates than previously proposed techniques.

working conference on reverse engineering | 2012

Inter-Project Functional Clone Detection Toward Building Libraries - An Empirical Study on 13,000 Projects

Tomoya Ishihara; Keisuke Hotta; Yoshiki Higo; Hiroshi Igaki; Shinji Kusumoto

Libraries created from commonly used functionalities offer a variety of benefits to developers. To locate such widely used functionalities, clone detection on a large corpus of source code could be useful. However, existing clone detection techniques did not address the creation of libraries. Therefore, existing clone detectors are sometimes unbefitting to detect candidates to be included in libraries. This paper proposes a method-based clone detection technique focusing on building libraries. This method-level granularity is appropriate for building libraries because a method composes a functionally coherent unit, and so it can be easily pulled up into libraries. Also, such a granularity realizes a scalable detection on huge data sets. Our experimental results on a huge data set (360 million lines of code, 13,000 projects) showed that the proposed technique could detect functional clones which might be beneficial on the creation of libraries within a short time frame.
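The method-level granularity described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each method body is reduced to a whitespace-insensitive token string and hashed, and that a clone group is interesting for library building when it spans at least two projects. The names `fingerprint` and `inter_project_clones`, the tokenizer, and the sample data are hypothetical and are not taken from the paper.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(body):
    """Hash a method body after a crude, whitespace-insensitive tokenization."""
    tokens = re.findall(r"\w+|[^\w\s]", body)
    return hashlib.sha1(" ".join(tokens).encode("utf-8")).hexdigest()


def inter_project_clones(methods):
    """Group methods by fingerprint; keep groups that span two or more projects.

    `methods` is an iterable of (project, method_name, body) tuples.
    """
    groups = defaultdict(list)
    for project, name, body in methods:
        groups[fingerprint(body)].append((project, name))
    return [
        members
        for members in groups.values()
        if len({project for project, _ in members}) >= 2
    ]


# Hypothetical sample data: two projects share one method up to whitespace.
sample = [
    ("projA", "max", "return a > b ? a : b;"),
    ("projB", "maximum", "return a>b?a:b;"),
    ("projA", "min", "return a < b ? a : b;"),
]
clones = inter_project_clones(sample)
```

Because each method is hashed once and grouping is a dictionary lookup, the work grows roughly linearly with the number of methods, which is one reason a coarse method-level unit can scale to large corpora.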

international conference on program comprehension | 2014

Hey! are you committing tangled changes?

Hiroyuki Kirinuki; Yoshiki Higo; Keisuke Hotta; Shinji Kusumoto

Although there is a principle that states a commit should only include changes for a single task, it is not always respected by developers. This means that code repositories often include commits that contain tangled changes. The presence of such tangled changes hinders analyzing code repositories because most mining software repository (MSR) approaches are designed with the assumption that every commit includes only changes for a single task. In this paper, we propose a technique to inform developers that they are in the process of committing tangled changes. The proposed technique utilizes the changes included in the past commits to judge whether a given commit includes tangled changes. If it determines that the proposed commit may include tangled changes, it offers suggestions on how the tangled changes can be split into a set of untangled changes.

international conference on program comprehension | 2013

Gapped code clone detection with lightweight source code analysis

Hiroaki Murakami; Keisuke Hotta; Yoshiki Higo; Hiroshi Igaki; Shinji Kusumoto

A variety of methods for detecting code clones have been proposed. In order to detect gapped code clones, AST-based techniques, PDG-based techniques, metric-based techniques and text-based techniques using the LCS algorithm have been proposed. However, each of those techniques has limitations. For example, existing AST-based and PDG-based techniques incur costs for transforming source files into intermediate representations such as ASTs or PDGs and comparing them. Existing metric-based techniques and text-based techniques using the LCS algorithm cannot detect code clones if methods or blocks are partially duplicated. This paper proposes a new method that detects gapped code clones using the Smith-Waterman algorithm to resolve those limitations. The Smith-Waterman algorithm is an algorithm for identifying similar alignments between two sequences even if they include some gaps. The authors developed the proposed method as a software tool named CDSW, and confirmed that the proposed method could resolve the limitations by conducting a quantitative evaluation with Bellon's benchmark.
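As a rough illustration of how Smith-Waterman tolerates gaps, the following sketch computes the best local alignment score between two token sequences. The scoring values (match 2, mismatch -1, gap -1) and the function name are assumptions chosen for the example. They are not the parameters used by CDSW.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (best_score, (i_end, j_end)) for the best local alignment of a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    # h[i][j] holds the best score of an alignment ending at a[i-1], b[j-1].
    h = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets an alignment restart anywhere (local alignment).
            h[i][j] = max(0, diag, h[i - 1][j] + gap, h[i][j - 1] + gap)
            if h[i][j] > best:
                best, best_pos = h[i][j], (i, j)
    return best, best_pos


# The second sequence has one extra token ("X") inside an otherwise equal run.
score, end = smith_waterman(list("ABCD"), list("ABXCD"))
```

Here the four matching tokens contribute 8 and the single gap costs 1, so the aligned region survives the inserted token instead of being split into two shorter matches.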

conference on software maintenance and reengineering | 2014

Does return null matter

Shuhei Kimura; Keisuke Hotta; Yoshiki Higo; Hiroshi Igaki; Shinji Kusumoto

Developers often use null references for the returned values of methods (return null) in object-oriented languages. Although developers often use return null to indicate that a program does not satisfy some necessary conditions, it is generally felt that a method returning null is costly to maintain. One of the reasons is that when a method receives a value returned from a method invocation whose code includes return null, it is necessary to check whether the returned value is null or not (null check). As developers often forget to write null checks, null dereferences occur frequently. However, it has not been clarified to what degree return null affects software maintenance during software evolution. This paper shows the influences of return null by investigating return null and null check in the evolution of source code. Experiments conducted on 14 open source projects showed that developers modify return null more frequently than return statements that do not include null. This result indicates that return null has a negative effect on software maintenance. It was also found that the size and the development phases of projects have no effect on the frequency of modifications on return null and null check. In addition, we found that all the projects in this experiment had from one to four null checks per 100 lines.

Electronic Communication of The European Association of Software Science and Technology | 2014

How Accurate Is Coarse-grained Clone Detection?: Comparision with Fine-grained Detectors

Keisuke Hotta; Jiachen Yang; Yoshiki Higo; Shinji Kusumoto

Research on clone detection has been quite successful over the past two decades, producing a number of state-of-the-art clone detectors. However, it is still challenging to detect clones, even with such successful detectors, across multiple projects or on thousands of revisions of code in limited time. A simple and coarse-grained detector can be an alternative to detectors using fine-grained analysis. It drastically reduces the time required for detection, although it may miss some of the clones that fine-grained detectors can detect. Hence, it should be adequate for a tentative analysis of clones if it has an acceptable accuracy. However, it is not clear how accurate such a coarse-grained approach is. This paper evaluates the accuracy of a coarse-grained clone detector compared with some fine-grained clone detectors. Our experiment provides empirical evidence of the acceptable accuracy of such a coarse-grained approach. Thus, we conclude that coarse-grained detection is adequate for summarizing clone analysis and as a starting point for detailed analysis, including manual inspections and bug detection.

source code analysis and manipulation | 2012

Folding Repeated Instructions for Improving Token-Based Code Clone Detection

Hiroaki Murakami; Keisuke Hotta; Yoshiki Higo; Hiroshi Igaki; Shinji Kusumoto

A variety of code clone detection methods have been proposed before now. However, only a small part of them is widely used. Widely-used methods are line-based and token-based ones. They have high scalability because they require neither deep source code analysis nor the construction of complex intermediate structures for the detection. High scalability is one of the big advantages in code clone detection tools. On the other hand, line/token-based detections yield many false positives. One of the factors is the presence of repeated instructions in the source code. For example, herein we assume that there are three consecutive printf statements in C source code. If we apply a token-based detection to them, the former two statements are detected as a code clone of the latter two statements. However, such overlapped code clones are redundant and so not useful for developers. In this paper, we propose a new detection method that is free from the influence of the presence of repeated instructions. The proposed method transforms every repeated instruction into a special form, and then it detects code clones using a suffix array algorithm. The transformation prevents many false positives from being detected. Also, the detection speed is maintained. The proposed detection method has already been developed as a software tool, FRISC. We confirmed the usefulness of the proposed method by conducting a quantitative evaluation of FRISC with Bellon's oracle.
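The folding idea can be sketched as follows, under simplifying assumptions: statements are already split one per string, literals are replaced with placeholders, and each run of consecutive statements with the same normalized tokens is collapsed into a single entry with a repeat count. The normalization rules and the names `normalize` and `fold_repeats` are hypothetical. FRISC's actual "special form" and its suffix array detection step are not reproduced here.

```python
import re
from itertools import groupby


def normalize(stmt):
    """Tokenize a statement and replace string and numeric literals with placeholders."""
    tokens = re.findall(r'"[^"]*"|\w+|[^\w\s]', stmt)
    out = []
    for tok in tokens:
        if tok.startswith('"'):
            out.append("$STR")
        elif tok[0].isdigit():
            out.append("$NUM")
        else:
            out.append(tok)
    return tuple(out)


def fold_repeats(statements):
    """Collapse each run of consecutive, identically normalized statements."""
    return [(key, len(list(run))) for key, run in groupby(statements, key=normalize)]


# Three consecutive printf calls, as in the example from the abstract.
source = ['printf("a");', 'printf("b");', 'printf("c");', "x = 1;"]
folded = fold_repeats(source)
```

After folding, the three printf statements occupy a single position in the sequence, so a token-based matcher run on the folded sequence has no overlapping window in which the first two statements could match the last two.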

international workshop on principles of software evolution | 2013

Enhancement of CRD-based clone tracking

Yoshiki Higo; Keisuke Hotta; Shinji Kusumoto

Many researchers have conducted a variety of research related to clone evolution. In order to grasp how clones have evolved, clones must be tracked. However, conventional clone tracking techniques are not feasible to track clones if they moved to another location in the source code. Consequently, in this research, we propose a new clone tracking technique. The proposed technique is an enhanced version of clone tracking with clone region descriptor (CRD) proposed by Duala-Ekoko and Robillard. The proposed technique can track clones even if they moved to another location. We have implemented a software tool based on the proposed technique, and applied it to two open source systems. In the experiment, we confirmed that the proposed technique could track 44 clone groups, which the conventional CRD tracking could not track. The accuracy of the tracking for those clones was 91%.

asia-pacific software engineering conference | 2013

How Much Do Code Repositories Include Peripheral Modifications

Noa Kusunoki; Keisuke Hotta; Yoshiki Higo; Shinji Kusumoto

In the last decade, a variety of studies on mining software repositories have been conducted. Mining repositories has the potential to yield useful knowledge for future development and maintenance. When software repositories are mined, large commits are often excluded from mining targets because large commits include merging and we believe that large commits include peripheral modifications, which may negatively affect mining code repositories. However, if large commits include code modifications, excluding large commits loses such modifications unintentionally. Moreover, such data cleansing assumes that there are no peripheral modifications in small commits. In this paper, we investigate how many peripheral modifications are included in commits in code repositories. As a result, we found that excluding large commits is insufficient to remove hindrances in commits for mining code repositories.
