Hamid Abdul Basit
Lahore University of Management Sciences
Publication
Featured research published by Hamid Abdul Basit.
foundations of software engineering | 2005
Hamid Abdul Basit; Stan Jarzabek
Cloning in software systems is known to create problems during software maintenance. Several techniques have been proposed to detect the same or similar code fragments in software, so-called simple clones. While the knowledge of simple clones is useful, detecting design-level similarities in software could ease maintenance even further, and also help us identify reuse opportunities. We observed that recurring patterns of simple clones - so-called structural clones - often indicate the presence of interesting design-level similarities. An example would be patterns of collaborating classes or components. Finding structural clones that signify potentially useful design information requires efficient techniques to analyze the bulk of simple clone data and to make non-trivial inferences based on the abstracted information. In this paper, we describe a practical solution to the problem of detecting some basic, but useful, types of design-level similarities such as groups of highly similar classes or files. First, we detect simple clones by applying conventional token-based techniques. Then we find the patterns of co-occurring clones in different files using the Frequent Itemset Mining (FIM) technique. Finally, we perform file clustering to detect those clusters of highly similar files that are likely to contribute to a design-level similarity pattern. The novelty of our approach is the application of data mining techniques to detect design-level similarities. Experiments confirmed that our method finds many useful structural clones and scales up to big programs. The paper describes our method for structural clone detection, a prototype tool called Clone Miner that implements the method, and experimental results.
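As a rough illustration of the second step described above, the following Python sketch treats each file as a "transaction" of simple clone class IDs and reports the sets of clone classes that co-occur in at least `min_support` files. It is only a sketch under stated assumptions: the function name `frequent_clone_sets`, the input layout, and the sample data are hypothetical, and the naive enumeration of combinations stands in for an optimized FIM algorithm; it is not Clone Miner's implementation.

```python
from collections import defaultdict
from itertools import combinations


def frequent_clone_sets(file_clones, min_support=2, max_size=3):
    """Map each clone-class set shared by >= min_support files to those files.

    file_clones: dict of file name -> set of simple clone class IDs (hypothetical layout).
    Naive enumeration of all combinations up to max_size; not an optimized FIM algorithm.
    """
    support = defaultdict(set)
    for fname, classes in file_clones.items():
        items = sorted(classes)  # canonical order so equal sets give equal tuples
        for size in range(2, max_size + 1):
            for combo in combinations(items, size):
                support[combo].add(fname)
    return {combo: files for combo, files in support.items()
            if len(files) >= min_support}


# Hypothetical sample data: three files and the clone classes found in each.
file_clones = {
    "A.java": {"c1", "c2", "c3"},
    "B.java": {"c1", "c2", "c3"},
    "C.java": {"c1", "c4"},
}
patterns = frequent_clone_sets(file_clones)
for combo, files in sorted(patterns.items()):
    print(combo, sorted(files))
```

Here `A.java` and `B.java` share the whole set of clone classes, so they would be candidates for the file-level similarity that the clustering step looks for, while `C.java` shares only one class and contributes no pattern.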
IEEE Transactions on Software Engineering | 2009
Hamid Abdul Basit; Stan Jarzabek
Code clones are similar program structures recurring in variant forms in software system(s). Several techniques have been proposed to detect similar code fragments in software, so-called simple clones. Identification and subsequent unification of simple clones is beneficial in software maintenance. Even further gains can be obtained by elevating the level of code clone analysis. We observed that recurring patterns of simple clones often indicate the presence of interesting higher-level similarities that we call structural clones. Structural clones show a bigger picture of the similarity situation than simple clones alone. Being logical groups of simple clones, structural clones alleviate the problem of the huge number of clones typically reported by simple clone detection tools, a problem that is often dealt with using post-detection visualization techniques. Detection of structural clones can help in understanding the design of the system for better maintenance and in reengineering for reuse, among other uses. In this paper, we propose a technique to detect some useful types of structural clones. The novelty of our approach includes the formulation of the structural clone concept and the application of data mining techniques to detect these higher-level similarities. We describe a tool called Clone Miner that implements our proposed technique. We assess the usefulness and scalability of the proposed techniques via several case studies. We discuss various usage scenarios to demonstrate in what ways the knowledge of structural clones adds value to the analysis based on simple clones alone.
foundations of software engineering | 2007
Hamid Abdul Basit; Simon J. Puglisi; William F. Smyth; Andrew Turpin; Stan Jarzabek
Code clones are similar code fragments that occur at multiple locations in a software system. Detection of code clones provides useful information for maintenance, reengineering, program understanding and reuse. Several techniques have been proposed to detect code clones. These techniques differ in the code representation used for analysis of clones, ranging from plain text to parse trees and program dependence graphs. Clone detection based on lexical tokens involves minimal code transformation and gives good results, but is computationally expensive because of the large number of tokens that need to be compared. We explored string algorithms to find suitable data structures and algorithms for efficient token-based clone detection and implemented them in our tool Repeated Tokens Finder (RTF). Instead of using a suffix tree for string matching, we use a more memory-efficient suffix array. RTF incorporates a suffix-array-based linear-time algorithm to detect string matches. It also provides a simple and customizable tokenization mechanism. Initial analysis and experiments show that our clone detection is simple, scalable, and performs better than the previous well-known tools.
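The idea of finding repeated token sequences with a suffix array can be sketched in a few lines of Python. This is a minimal illustration, not RTF's algorithm: the suffix array is built by naive sorting (not in linear time), the tokenizer is a plain whitespace split, and the function names are hypothetical. Adjacent suffixes in the sorted order share the longest prefixes, so comparing neighbours is enough to surface repeated runs.

```python
def suffix_array(tokens):
    """Start indices of all suffixes, sorted by suffix content (naive construction)."""
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])


def common_prefix_len(tokens, i, j):
    """Number of tokens the suffixes starting at i and j have in common."""
    n = 0
    while i + n < len(tokens) and j + n < len(tokens) and tokens[i + n] == tokens[j + n]:
        n += 1
    return n


def repeated_runs(tokens, min_len=3):
    """Return (first_start, second_start, length) for neighbouring suffixes
    that share at least min_len tokens. Nested repeats are reported as well."""
    sa = suffix_array(tokens)
    found = []
    for a, b in zip(sa, sa[1:]):
        n = common_prefix_len(tokens, a, b)
        if n >= min_len:
            found.append((min(a, b), max(a, b), n))
    return sorted(found)


# Hypothetical input: two statements that repeat the token run "= a + b ;".
tokens = "x = a + b ; y = a + b ;".split()
clones = repeated_runs(tokens, min_len=5)
print(clones)  # [(1, 7, 5)]
```

Lowering `min_len` also reports the shorter runs nested inside the longest one, so a real detector would additionally keep only maximal matches.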
international conference on software engineering | 2005
Hamid Abdul Basit; Damith C. Rajapakse; Stan Jarzabek
Templates (or generics) help us write compact, generic code, which aids both reuse and maintenance. The STL is a powerful example of how templates help achieve these goals. Still, our study of the STL revealed substantial, and in our opinion, counter-productive repetitions (so-called clones) across groups of similar class or function templates. Clones occurred, as variations across these similar program structures were irregular and could not be unified by suitable template parameters in a natural way. We encountered similar problems in other class libraries as well as in application programs, written in a range of programming languages. In the paper, we present quantitative and qualitative results from our study. We argue that the difficulties we encountered affect programs in general. We present a solution that can treat such template-unfriendly cases of redundancies at the meta-level, complementing and extending the power of language features, such as templates, in areas of generic programming.
international conference on software maintenance | 2008
Yali Zhang; Hamid Abdul Basit; Stan Jarzabek; Dang Anh; Melvin Low
Code clones are similar program structures recurring in software systems. Clone detectors produce much information and a challenge is to identify useful clones depending on the goals of clone analysis. To do so, further abstraction, filtering and visualization of cloning information, with the involvement of a human expert, is required. In this paper, we describe a technique for filtering and visualization of cloning information generated by Clone Miner, a clone detection tool presented in our earlier work. A unique benefit and contribution of our approach is that a human expert can define a wide range of filters to extract abstract views of the cloning data using a clone-query system to suit specific needs of clone analysis. We then produce standardized graphical presentations of those views for various types of clone queries. We implemented the technique as an Eclipse plug-in called Clone Visualizer. Clone Visualizer works closely with Clone Miner, which not only finds similar code fragments (simple clones) but also finds higher-level abstractions of the cloning information. Our method is the first attempt to address filtering and visualization of those higher-level cloning abstractions. We illustrate the application of our technique with examples from a clone analysis project with Clone Miner and Clone Visualizer.
IEEE Transactions on Reliability | 2006
Ling Yuan; Jin Song Dong; Jing Sun; Hamid Abdul Basit
This paper proposes a novel heterogeneous software architecture, GFTSA (Generic Fault Tolerant Software Architecture), which can guide the development of safety-critical distributed systems. GFTSA incorporates an idealized fault tolerant component concept and a coordinated error recovery mechanism in the early system design phase. It can be reused in the high-level model design of specific safety-critical distributed systems with reliability requirements. To provide precise common idioms and patterns for the system designers, the formal language Object-Z is used to specify GFTSA. Formal proofs based on Object-Z reasoning rules are constructed to demonstrate that the proposed GFTSA model can preserve significant fault tolerant properties. The inheritance and instantiation mechanisms of Object-Z can contribute to the customization of the GFTSA formal model. By analyzing the customization process, we also present a template of GFTSA, expressed in x-frames using the XVCL (XML-based Variant Configuration Language) methodology to make the customization process more direct and automatic. We use an LDAS (Line Direction Agreement System) case study to illustrate that GFTSA can guide the development of specific safety-critical distributed systems.
international conference on software maintenance | 2012
Hamid Abdul Basit; Usman Ali; Sidra Haque; Stan Jarzabek
In previous work, we described a technique for detecting design-level similar program structures (structural clones) formed from recurring configurations of similar code fragments (simple clones). In this paper, we analyze in detail how frequently these structural clones occur in software systems and how structural clone analysis extends the benefits of analysis based on simple clones only. Our case study of 11 open source systems revealed that over 50% of simple clones are captured by structural clones that often correspond to meaningful design or application domain concepts. Because of their larger size, it is easier for programmers to perceive the similarity situation in a system from structural clone perspective rather than from simple clone perspective only. We also discuss the contribution of structural clone detection towards program understanding, design recovery, maintenance, and refactoring using examples from the case study systems.
software visualization | 2015
Hamid Abdul Basit; Muhammad Hammad; Rainer Koschke
Comprehending software clones is necessary for a number of activities in software development. The comprehension of software clones is challenged by the sheer volume of data and the complexity of the information content in that data. Visualization, or visual data analysis, takes advantage of human cognitive skills to discover unstructured insights from the visual presentations of complex and voluminous data. In this paper, we survey the existing literature on visualization of software clones. We gather the insights provided, and put that information in context of actual information needs systematically derived from the clone management goals. This framework allows us to better understand the role a visualization may play in achieving a specific user goal, identify potential gaps between existing types of visualization and information needs, and find complementary non-redundant subsets of visualizations for each user goal.
international workshop on software clones | 2015
Hamid Abdul Basit; Muhammad Hammad; Stan Jarzabek; Rainer Koschke
Clone detection can be used to achieve diverse objectives such as refactoring, program understanding, bug localization, and plagiarism detection. Each goal takes a different perspective on clone information needs. Different clone detection tools report different information about clones. To gauge the suitability of a given clone detector for a particular user objective, we need to determine which information needs implied by the objective a clone detector addresses. In this paper, we make a first step toward gathering clone information needs from the description of user goals. The results of our analysis are useful for various stakeholders such as programmers, managers, tool developers, and researchers.
Programming and Computer Software | 2016
Dmitry Luciv; Dmitrij Koznov; Hamid Abdul Basit; Andrey N. Terekhov
Increasing complexity of software documentation calls for additional requirements of document maintenance. Documentation reuse can make a considerable contribution to solve this problem. This paper presents a method for fuzzy repetitions search in software documentation that is based on software clone detection. The search results are used for document refactoring. This paper also presents Documentation Refactoring Toolkit implementing the proposed method and integrated with the DocLine project. The proposed approach is evaluated on documentation packages for a number of open-source projects: Linux Kernel, Zend Framework, Subversion, and DocBook.