Sunny Wong | Researchain

Publication

Featured research published by Sunny Wong.

International Conference on Software Engineering | 2011

Detecting software modularity violations

Sunny Wong; Yuanfang Cai; Miryung Kim; Michael Dalton

This paper presents Clio, an approach that detects modularity violations, which can cause software defects, modularity decay, or expensive refactorings. Clio computes the discrepancies between how components should change together based on the modular structure, and how components actually change together as revealed in version history. We evaluated Clio using 15 releases of Hadoop Common and 10 releases of Eclipse JDT. The results show that hundreds of violations identified using Clio were indeed recognized as design problems or refactored by the developers in later versions. The identified violations exhibit multiple symptoms of poor design, some of which are not easily detectable using existing approaches.
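The comparison the abstract describes, between how components are expected to change together and how they actually change together, can be illustrated with a minimal sketch. This is not Clio's algorithm: the function names, the `min_support` threshold, and the representation of the modular structure as a flat set of file pairs are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def co_change_counts(commits):
    """Count how often each unordered pair of files appears in the same commit."""
    counts = Counter()
    for files in commits:
        for a, b in combinations(sorted(set(files)), 2):
            counts[(a, b)] += 1
    return counts


def candidate_violations(commits, structural_pairs, min_support=2):
    """Return pairs that co-change at least min_support times (hypothetical
    threshold) but are not expected to change together structurally."""
    expected = {tuple(sorted(pair)) for pair in structural_pairs}
    counts = co_change_counts(commits)
    return sorted(
        pair for pair, n in counts.items()
        if n >= min_support and pair not in expected
    )


# Toy history: each commit is the list of files it touched.
commits = [
    ["A.java", "B.java"],
    ["A.java", "C.java"],
    ["A.java", "C.java"],
    ["B.java", "C.java", "A.java"],
]
# Toy modular structure: only A and B are expected to change together.
structural_pairs = [("A.java", "B.java")]

violations = candidate_violations(commits, structural_pairs)
print(violations)
```

In this toy history, A.java and C.java change together repeatedly although the structure does not predict it, so that pair is the one flagged for review.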

Software Quality Journal | 2014

Comparing four approaches for technical debt identification

Nico Zazworka; Antonio Vetro; Clemente Izurieta; Sunny Wong; Yuanfang Cai; Carolyn B. Seaman; Forrest Shull

Software systems accumulate technical debt (TD) when short-term goals in software development are traded for long-term goals (e.g., quick-and-dirty implementation to reach a release date versus a well-refactored implementation that supports the long-term health of the project). Some forms of TD accumulate over time in the form of source code that is difficult to work with and exhibits a variety of anomalies. A number of source code analysis techniques and tools have been proposed to potentially identify the code-level debt accumulated in a system. What has not yet been studied is whether using multiple tools to detect TD can lead to benefits, that is, whether different tools will flag the same or different source code components. Further, these techniques also lack investigation into the symptoms of TD “interest” that they lead to. To address this latter question, we also investigated whether TD, as identified by the source code analysis techniques, correlates with interest payments in the form of increased defect- and change-proneness. Our objective was to compare the results of different TD identification approaches to understand their commonalities and differences and to evaluate their relationship to indicators of future TD “interest.” We selected four different TD identification techniques (code smells, automatic static analysis issues, grime buildup, and modularity violations) and applied them to 13 versions of the Apache Hadoop open source software project. We collected and aggregated statistical measures to investigate whether the different techniques identified TD indicators in the same or different classes and whether those classes in turn exhibited high interest (in the form of a large number of defects and higher change-proneness). The outputs of the four approaches have very little overlap and are therefore pointing to different problems in the source code. Dispersed Coupling and modularity violations were co-located in classes with higher defect-proneness. We also observed a strong relationship between modularity violations and change-proneness. Our main contribution is an initial overview of the TD landscape, showing that different TD techniques are loosely coupled and therefore indicate problems in different locations of the source code. Moreover, our proxy interest indicators (change- and defect-proneness) correlate with only a small subset of TD indicators.

Automated Software Engineering | 2009

Design Rule Hierarchies and Parallelism in Software Development Tasks

Sunny Wong; Yuanfang Cai; Giuseppe Valetto; Georgi Simeonov; Kanwarpreet Sethi

As software projects continue to grow in scale, being able to maximize the work that developers can carry out in parallel as a set of concurrent development tasks, without incurring excessive coordination overhead, becomes increasingly important. Prevailing design models, however, are not explicitly conceived to suggest how development tasks on the software modules they describe can be effectively parallelized. In this paper, we present a design rule hierarchy based on the assumption relations among design decisions. Software modules located within the same layer of the hierarchy suggest independent, hence parallelizable, tasks. Dependencies between layers or within a module suggest the need for coordination during concurrent work. We evaluate our approach by investigating the source code and mailing list of Apache Ant. We observe that technical communication between developers working on different modules within the same hierarchy layer, as predicted, is significantly less than communication between developers working across layers.
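The idea that modules in the same layer suggest independent tasks can be illustrated with a simple longest-path layering of a dependency graph. This is only a sketch under stated assumptions: the paper derives its hierarchy from assumption relations among design decisions, whereas the code below uses a plain "depends on" map, and the module names are hypothetical.

```python
def layers(depends_on):
    """Group modules into layers; a module's layer is one more than the
    deepest layer among the modules it depends on. Modules sharing a layer
    have no dependency on one another in this simplified model."""
    level = {}

    def depth(module, stack=()):
        if module in level:
            return level[module]
        if module in stack:
            raise ValueError("dependency cycle at " + module)
        deps = depends_on.get(module, ())
        level[module] = 1 + max(
            (depth(d, stack + (module,)) for d in deps), default=-1
        )
        return level[module]

    modules = set(depends_on)
    for deps in depends_on.values():
        modules.update(deps)
    for module in modules:
        depth(module)

    grouped = {}
    for module, lvl in level.items():
        grouped.setdefault(lvl, []).append(module)
    return [sorted(grouped[lvl]) for lvl in sorted(grouped)]


# Hypothetical modules: ITask plays the role of a design rule (an interface).
depends_on = {
    "ITask": set(),
    "Parser": {"ITask"},
    "Executor": {"ITask"},
    "CLI": {"Parser", "Executor"},
}

hierarchy = layers(depends_on)
print(hierarchy)
```

Here Parser and Executor land in the same layer, which in this simplified reading marks them as candidates for parallel work, while CLI must coordinate with both.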

Working IEEE/IFIP Conference on Software Architecture | 2009

From retrospect to prospect: Assessing modularity and stability from software architecture

Kanwarpreet Sethi; Yuanfang Cai; Sunny Wong; Alessandro Garcia; Cláudio Sant'Anna

Architecture-level decisions, directly influenced by environmental factors, are crucial to preserving modularity and stability throughout the software development life cycle. Tradeoffs of modularization alternatives, such as aspect-oriented vs. object-oriented decompositions, thus need to be assessed from architecture models instead of source code. In this paper, we present a suite of architecture-level metrics, taking external factors that drive software changes into consideration and measuring how well an architecture produces independently substitutable modules. We formalize these metrics using logical models to automate quantitative stability and modularity assessment. We evaluate the metrics using eight aspect-oriented and object-oriented releases of a software product-line architecture, driven by a series of heterogeneous changes. By contrasting with an implementation-level analysis, we observe that these metrics can effectively reveal which modularization alternative generates more stable, modular design from high-level models.

Quality of Software Architectures | 2013

Leveraging design rules to improve software architecture recovery

Yuanfang Cai; Hanfei Wang; Sunny Wong; Linzhang Wang

In order to recover software architecture, various clustering techniques have been created to automatically partition a software system into meaningful subsystems. While these techniques have demonstrated their effectiveness, we observe that a key feature within most software systems has not been fully exploited: most well-designed systems follow strong architectural design rules that split the overall system into modules. These design rules are often manifested as special program constructs, such as shared data structures or abstract interfaces, which should not belong to any of the subordinate modules. We contribute a new perspective of architecture recovery based on this rationale, which enables the combination of design-rule-based clustering with other clustering techniques, as well as enabling the splitting of a large system into subsystems. We evaluated our approach both quantitatively and qualitatively, using both open source and real industrial software projects.

International Conference on Software Maintenance | 2009

Predicting change impact from logical models

Sunny Wong; Yuanfang Cai

To improve the ability to predict the impact scope of a given change, we present two approaches applicable to the maintenance of object-oriented software systems. Our first approach exclusively uses a logical model extracted from UML relations among classes, and our other, hybrid approach additionally considers information mined from version histories. Using the open source Hadoop system, we evaluate our approaches by comparing our impact predictions with predictions generated using existing data mining techniques, and with actual change sets obtained from bug reports. We show that both our approaches produce better predictions when the system is immature and the version history is not well-established, and our hybrid approach produces results comparable to data mining as the system evolves.

Automated Software Engineering | 2009

Improving the Efficiency of Dependency Analysis in Logical Decision Models

Sunny Wong; Yuanfang Cai

To address the problem that existing software dependency extraction methods do not work on higher-level software artifacts, do not express decisions explicitly, and do not reveal implicit or indirect dependencies, our recent work explored the possibility of formally defining and automatically deriving a pairwise dependence relation from an augmented constraint network (ACN) that models the assumption relation among design decisions. The current approach is difficult to scale, requiring constraint solving and solution enumeration. We observe that the assumption relation among design decisions for most software systems can be abstractly modeled using a special form of ACN. For these more restrictive, but highly representative models, we present an O(n^3) algorithm to derive the dependency relation without solving the constraints. We evaluate our approach by computing design structure matrices for existing ACNs that model multiple versions of heterogeneous real software designs, often reducing the running time from hours to seconds.

Working Conference on Reverse Engineering | 2010

Reverse Engineering Utility Functions Using Genetic Programming to Detect Anomalous Behavior in Software

Sunny Wong; Melissa Aaron; Jeffrey Segall; Kevin Lynch; Spiros Mancoridis

Recent studies have shown the promise of using utility functions to detect anomalous behavior in software systems at runtime. However, it remains a challenge for software engineers to hand-craft a utility function that achieves both a high precision (i.e., few false alarms) and a high recall (i.e., few undetected faults). This paper describes a technique that uses genetic programming to automatically evolve a utility function for a specific system, set of resource usage metrics, and precision/recall preference. These metrics are computed using sensor values that monitor a variety of system resources (e.g., memory usage, processor usage, thread count). The technique allows users to specify the relative importance of precision and recall, and builds a utility function to meet those requirements. We evaluated the technique on the open source Jigsaw web server using ten resource usage metrics and five anomalous behaviors in the form of injected faults in the Jigsaw code and a security attack. To assess the effectiveness of the technique, the precision and recall of the evolved utility function were compared to those of a hand-crafted utility function that uses a simple thresholding scheme. The results show that the evolved function outperformed the hand-crafted function by 10 percent.
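One standard way to encode a relative preference between precision and recall in a single score is the F-beta measure. The abstract does not state which fitness function the technique uses, so the sketch below is only an assumption about how such a preference could be expressed; the function names and the sample counts are illustrative.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    flagged = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / flagged if flagged else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


def f_beta(precision, recall, beta=1.0):
    """F-beta score: beta > 1 weights recall more, beta < 1 weights precision more."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative detector outcome: 8 real anomalies flagged, 2 false alarms,
# 8 anomalies missed.
p, r = precision_recall(true_pos=8, false_pos=2, false_neg=8)
balanced = f_beta(p, r, beta=1.0)
recall_heavy = f_beta(p, r, beta=2.0)
precision_heavy = f_beta(p, r, beta=0.5)
print(p, r, balanced, recall_heavy, precision_heavy)
```

Because this detector has higher precision than recall, the recall-heavy score is lower than the balanced one and the precision-heavy score is higher, which is the kind of trade-off a user-specified preference would steer.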

Conference on Software Engineering Education and Training | 2011

Leveraging design structure matrices in software design education

Yuanfang Cai; Daniel Iannuzzi; Sunny Wong

Important software design concepts, such as information hiding and separation of concerns, are often conveyed to students informally. The modularity and hence maintainability of student software is difficult to assess. In this paper, we report our study of using a design structure matrix (DSM) to assess the modularity of student software by comparing the differences between the DSM representing the intended design and the DSMs representing the software implemented by the students. We applied this approach to a software design class at Drexel University. We found that even though the lab and homework assignments were of small scale, and in many cases, detailed designs were given to the students in the form of UML class diagrams, 74% of the 85 student submissions, although they fulfilled the required functionality, introduced unexpected dependencies, so that modules that were designed to be independent were actually coupled. These design problems can only be revealed during software evolution, which is usually not possible for student projects. The results show the necessity and benefits of applying DSM modeling to make such design problems explicit to the students.
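The comparison between an intended DSM and an implemented DSM can be sketched as a set difference over dependency cells. The representation below (a mapping from each module to the set of modules it depends on) and the module names are assumptions for illustration, not the tooling used in the study.

```python
def unexpected_dependencies(intended, implemented):
    """Return (source, target) cells that are marked in the implemented DSM
    but absent from the intended DSM."""
    return sorted(
        (src, dst)
        for src, targets in implemented.items()
        for dst in targets
        if dst not in intended.get(src, set())
    )


# Intended design: concrete modules depend only on interfaces.
intended = {
    "Controller": {"IView", "IModel"},
    "View": {"IView"},
    "Model": {"IModel"},
}
# Hypothetical student submission: two modules reach into Model directly.
implemented = {
    "Controller": {"IView", "IModel", "Model"},
    "View": {"IView", "Model"},
    "Model": {"IModel"},
}

extra = unexpected_dependencies(intended, implemented)
print(extra)
```

Each returned pair corresponds to a DSM cell that couples modules the design meant to keep independent, even though the submission may still pass functional tests.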

Automated Software Engineering | 2011

Generalizing evolutionary coupling with stochastic dependencies

Sunny Wong; Yuanfang Cai

Researchers have leveraged evolutionary coupling derived from revision history to conduct various software analyses, such as software change impact analysis (IA). The problem is that the validity of historical data depends on the recency of changes and varies with different evolution paths—thus, influencing the accuracy of analysis results. In this paper, we formalize evolutionary coupling as a stochastic process using a Markov chain model. By varying the parameters of this model, we define a family of stochastic dependencies that accounts for different types of evolution paths. Each member of this family weighs historical data differently according to their recency and frequency. To assess the utility of this model, we conduct IA on 78 releases of five open source systems, using 16 stochastic dependency types, and compare with the results of several existing approaches. The results show that our stochastic-based IA technique can provide more accurate results than these existing techniques.
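The effect of weighing history by recency can be illustrated with a simple exponentially decayed co-change ratio. This is a stand-in chosen for brevity, not the Markov chain model the paper formalizes; the `decay` parameter and function name are hypothetical.

```python
def weighted_coupling(commits, a, b, decay=0.5):
    """Estimate how strongly a change to `a` implies a change to `b`.

    `commits` is ordered oldest to newest; each commit is a set of files.
    A commit that is k steps older than the newest one gets weight decay**k,
    so decay=1.0 reduces to the plain co-change frequency.
    """
    newest = len(commits) - 1
    weighted_together = 0.0
    weighted_total = 0.0
    for index, files in enumerate(commits):
        if a in files:
            weight = decay ** (newest - index)
            weighted_total += weight
            if b in files:
                weighted_together += weight
    return weighted_together / weighted_total if weighted_total else 0.0


# A and B used to change together, but the two most recent changes to A
# did not touch B.
commits = [{"A", "B"}, {"A", "B"}, {"A"}, {"A"}]

plain = weighted_coupling(commits, "A", "B", decay=1.0)
recent = weighted_coupling(commits, "A", "B", decay=0.5)
print(plain, recent)
```

With no decay the pair looks half-coupled, while the decayed estimate drops because the recent history no longer supports the coupling; varying the decay gives a family of estimates in the same spirit as varying the parameters of the paper's model.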
