Michael L. Collard | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Michael L. Collard is active.

Explore More

Publication

Featured researches published by Michael L. Collard.

workshop on program comprehension | 2003

An XML-based lightweight C++ fact extractor

Michael L. Collard; Huzefa H. Kagdi; Jonathan I. Maletic

A lightweight fact extractor is presented that utilizes XML tools, such as XPath and XSLT to extract static information from C++ source code programs. The source code is first converted into an XML representation, srcML, to facilitate the use of a wide variety of XML tools. The method is deemed lightweight because only a partial parsing of the source is done. Additionally, the technique is quite robust and can be applied to incomplete and noncompilable source code. The trade off to this approach is that queries on some low level details cannot be directly addressed. This approach is applied to a fact extractor benchmark as comparison with other, heavier weight, fact extractors. Fact extractors are widely used to support understanding tasks associated with maintenance, reverse engineering and various other software engineering tasks.

visualizing software for understanding and analysis | 2002

A task oriented view of software visualization

Jonathan I. Maletic; Andrian Marcus; Michael L. Collard

A number of taxonomies to classify and categorize software visualization systems have been proposed in the past. Most notable are those presented by Price (1993) and Roman (1993). While these taxonomies are an accurate representation of software visualization issues, they are somewhat skewed with respect to current research areas on software visualization. We revisit this important work and propose a number of re-alignments with respect to addressing the software engineering tasks of large-scale development and maintenance. We propose a framework to emphasize the general tasks of understanding and analysis during development and maintenance of large-scale software systems. Five dimensions relating to the what, where, how, who, and why of software visualization make up this framework. The focus of this work is not so much as to classify software visualization system, but to point out the need for matching the method with the task. Finally, a number of software visualization systems are examined under our framework to highlight the particular problems each addresses.

workshop on program comprehension | 2002

Source code files as structured documents

Jonathan I. Maletic; Michael L. Collard; Andrian Marcus

A means to add explicit structure to program source code is presented. XML is used to augment source code with syntactic information from the parse tree. More importantly, comments and formatting are preserved and identified for future use by development environments and program comprehension tools. The focus is to construct a document representation in XML instead of a more traditional data representation of the source code. This type of representation supports a programmer centric view of the source rather than a compiler centric view. Our representation is made relevant with respect to other research on XML representations of parse trees and program code. The highlights of the representation are presented and the use of queries and transformations discussed.

working conference on reverse engineering | 2010

Blending Conceptual and Evolutionary Couplings to Support Change Impact Analysis in Source Code

Huzefa H. Kagdi; Malcom Gethers; Denys Poshyvanyk; Michael L. Collard

The paper presents an approach that combines conceptual and evolutionary techniques to support change impact analysis in source code. Information Retrieval (IR) is used to derive conceptual couplings from the source code in a single version (release) of a software system. Evolutionary couplings are mined from source code commits. The premise is that such combined methods provide improvements to the accuracy of impact sets. A rigorous empirical assessment on the changes of the open source systems Apache httpd, ArgoUML, iBatis, and KOffice is also reported. The results show that a combination of these two techniques, across several cut points, provides statistically significant improvements in accuracy over either of the two techniques used independently. Improvements in recall values of up to 20% over the conceptual technique in KOffice and up to 45% over the evolutionary technique in iBatis were reported.

document engineering | 2002

Supporting document and data views of source code

Michael L. Collard; Jonathan I. Maletic; Andrian Marcus

The paper describes the use of an XML format to store and represent program source code. A new XML application, srcML (SouRCe Markup Language), is presented. srcML presumes a document view of source code where information about the syntactic structure is layered over the original source code document. The resultant multi-layered document has a base layer of all the original text (and formatting). The second layer is the syntactic information, derived from the grammar of the programming language, and is encoded in XML. This multi-layered view supports both the creation and viewing of the source code in its original form and the use of XML technologies (for tasks such as analysis and transformation of the source). Although directed at source code documents, (particularly C++) srcML is also applicable to other programming languages and to languages with a strict syntax. srcML represents a departure from the compiler centric manner in which source code is commonly stored, instead a document point of view is taken thus better supporting the manipulation and management of the large numbers of source documents typical in modern software systems.

international conference on software maintenance | 2004

Supporting source code difference analysis

Jonathan I. Maletic; Michael L. Collard

The paper describes an approach to easily conduct analysis of source-code differences. The approach is termed meta-differencing to reflect the fact that additional knowledge of the differences can be automatically derived. Meta-differencing is supported by an underlying source-code representation developed by the authors. The representation, srcML, is an XML format that explicitly embeds abstract syntax within the source code while preserving the documentary structure as dictated by the developer. XML tools are leveraged together with standard differencing utilities (i.e., diff,) to generate a meta-difference. The meta-difference is also represented in an XML format called srcDiff. The meta-difference contains specific syntactic information regarding the source-code changes. In turn this can be queried and searched with XML tools for the purpose of extracting information about the specifics of the changes. A case study of using the meta-differencing approach on an open-source system is presented to demonstrate its usefulness and validity.

source code analysis and manipulation | 2011

Lightweight Transformation and Fact Extraction with the srcML Toolkit

Michael L. Collard; Michael John Decker; Jonathan I. Maletic

The srcML toolkit for lightweight transformation and fact-extraction of source code is described. srcML is an XML format for C/C++/Java source code. The open source toolkit that includes the source-to-srcML and srcML-to-source translators for round-trip reverse engineering is freely available. The direct use of XPath and XSLT is supported, an archive format for large projects is included, and a rich set of input and output formats through a command-line interface is available. Applying transformations and formulating queries using srcML is very convenient. Application use-cases of transformations and fact-extraction are shown and demonstrated to be practical and scalable.

international conference on software maintenance | 2010

Automatic identification of class stereotypes

Natalia Dragan; Michael L. Collard; Jonathan I. Maletic

An approach is presented to automatically determine a classs stereotype. The stereotype is based on the frequency and distribution of method stereotypes in the class. Method stereotypes are automatically determined using a defined taxonomy given in previous work. The stereotypes, boundary, control and entity are used as a basis but refined based on an empirical investigation of 21 systems. A number of heuristics, derived from empirical evidence, are used to determine a classs stereotype. For example, the prominence of certain types of methods can indicate a classs main role. The approach is applied to five open source systems and evaluated. The results show that 95% of the classes are stereotyped by the approach. Additionally, developers (via manual inspection) agreed with the approachs results.

Proceedings of the 2009 ICSE Workshop on Traceability in Emerging Forms of Software Engineering | 2009

TQL: A query language to support traceability

Jonathan I. Maletic; Michael L. Collard

A query language for traceability is proposed and presented. The language, TQL, is based in XML and supports queries across multiple artifacts and multiple traceability link types. A number of primitives are defined to allow complex queries to be constructed and executed. Example queries are presented in the context of traceability questions. The technical details of the language and issues of implementation are discussed.

Software Quality Journal | 2011

Automatically identifying changes that impact code-to-design traceability during evolution

Maen Hammad; Michael L. Collard; Jonathan I. Maletic

An approach is presented that automatically determines if a given source code change impacts the design (i.e., UML class diagram) of the system. This allows code-to-design traceability to be consistently maintained as the source code evolves. The approach uses lightweight analysis and syntactic differencing of the source code changes to determine if the change alters the class diagram in the context of abstract design. The intent is to support both the simultaneous updating of design documents with code changes and bringing old design documents up to date with current code given the change history. An efficient tool was developed to support the approach and is applied to an open source system. The results are evaluated and compared against manual inspection by human experts. The tool performs better than (error prone) manual inspection. The developed approach and tool were used to empirically investigate and understand how changes to source code (i.e., commits) break code-to-design traceability during evolution and the benefits from such understanding. Commits are categorized as design impact or no impact. The commits of four open source projects over 3-year time durations are extracted and analyzed. The results of the study show that most of the code changes do not impact the design and these commits have a smaller number of changed files and changed less lines compared to commits with design impact. The results also show that most bug fixes do not impact design.

Explore More