Publication


Featured research published by Mark D. Syer.


Source Code Analysis and Manipulation | 2011

Exploring the Development of Micro-apps: A Case Study on the BlackBerry and Android Platforms

Mark D. Syer; Bram Adams; Ying Zou; Ahmed E. Hassan

The recent meteoric rise in the use of smart phones and other mobile devices has led to a new class of applications, i.e., micro-apps, that are designed to run on devices with limited processing, memory, storage and display resources. Given the rapid succession of mobile technologies and the fierce competition, micro-app vendors need to release new features at breakneck speed, without sacrificing product quality. To understand how different mobile platforms enable such a rapid turnaround time, this paper compares three pairs of feature-equivalent Android and BlackBerry micro-apps. We do this by analyzing the micro-apps along the dimensions of source code, code dependencies and code churn. We find that BlackBerry micro-apps are much larger and rely more on third-party libraries. However, they are less susceptible to platform changes since they rely less on the underlying platform. On the other hand, Android micro-apps tend to concentrate code into fewer files and rely heavily on the Android platform. On both platforms, code churn of micro-apps is very high.
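
For illustration only, the sketch below shows one way to compute per-file code churn (lines added plus deleted) over a project's history. It assumes the history lives in a Git repository and parses `git log --numstat`; the churn measurements in the paper may have been computed differently.

```python
import subprocess
from collections import defaultdict

def churn_per_file(repo_path: str) -> dict:
    """Total lines added + deleted per file across a repository's history,
    computed from `git log --numstat` (binary files report '-' and are skipped)."""
    out = subprocess.run(
        ["git", "-C", repo_path, "log", "--numstat", "--format="],
        capture_output=True, text=True, check=True,
    ).stdout
    churn = defaultdict(int)
    for line in out.splitlines():
        parts = line.split("\t")
        if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
            added, deleted, path = parts
            churn[path] += int(added) + int(deleted)
    return dict(churn)
```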


Software Quality Journal | 2015

Studying the relationship between source code quality and mobile platform dependence

Mark D. Syer; Meiyappan Nagappan; Bram Adams; Ahmed E. Hassan

The recent meteoric rise in the use of smartphones and other mobile devices has led to a new class of software applications (i.e., mobile apps). One reason for this success is the extensive support available to mobile app developers through the APIs provided by mobile platforms (e.g., Android). In our previous research, we found that mobile apps tend to depend highly on these platform-specific APIs. High dependence on a particular mobile platform may introduce instability and defects, as these mobile platforms are rapidly evolving. Therefore, the extent of platform dependence may be an indicator of software quality. In this paper, we examine the relationship between platform dependence and defect proneness of the source code files of an Android app to determine whether software metrics based on platform dependence can be used to prioritize software quality assurance efforts. We find that (1) source code files that are defect prone have a higher dependence on the platform than defect-free files and (2) increasing the platform dependence increases the likelihood of a defect being present in a source code file. Thus, platform dependence may be used to prioritize the most defect-prone source code files for code reviews and unit testing by the software quality assurance team.
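
As a rough illustration of a platform-dependence measure, the sketch below approximates a file's dependence on the Android platform as the fraction of its import statements that resolve to platform packages, and ranks files accordingly. The prefix list and the metric itself are assumptions for this sketch, not necessarily the metrics used in the paper.

```python
import re

# Android platform package prefixes (an assumption for this sketch; the paper's
# exact dependence metrics may differ).
PLATFORM_PREFIXES = ("android.", "dalvik.", "java.", "javax.")

def platform_dependence(java_source: str) -> float:
    """Fraction of import statements that resolve to platform packages."""
    imports = re.findall(r"^\s*import\s+([\w.]+)\s*;", java_source, re.MULTILINE)
    if not imports:
        return 0.0
    platform = [imp for imp in imports if imp.startswith(PLATFORM_PREFIXES)]
    return len(platform) / len(imports)

def rank_files(sources: dict) -> list:
    """sources maps file path -> file contents; returns paths by descending
    platform dependence, so quality assurance effort can focus on the most
    platform-dependent (and, per the paper, most defect-prone) files."""
    return sorted(sources, key=lambda path: platform_dependence(sources[path]), reverse=True)
```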


International Conference on Performance Engineering | 2014

Continuous validation of load test suites

Mark D. Syer; Zhen Ming Jiang; Meiyappan Nagappan; Ahmed E. Hassan; Mohamed N. Nasser; Parminder Flora

Ultra-Large-Scale (ULS) systems face continuously evolving field workloads in terms of activated/disabled feature sets, varying usage patterns and changing deployment configurations. These evolving workloads often have a large impact on the performance of a ULS system. Hence, continuous load testing is critical to ensuring the error-free operation of such systems. A common challenge facing performance analysts is to validate whether a load test closely resembles the current field workloads. Such validation may be performed by comparing execution logs from the load test and the field. However, the size and unstructured nature of execution logs makes such a comparison infeasible without automated support. In this paper, we propose an automated approach that validates whether a load test resembles the field workload and, if not, determines how they differ by comparing execution logs from the load test and the field. Performance analysts can then update their load test cases to eliminate such differences, hence creating more realistic load test cases. We perform three case studies on two large systems: one open-source system and one enterprise system. Our approach identifies differences between load tests and the field with a precision of >75%, compared to only >16% for the state-of-the-practice.
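
As a simplified illustration of comparing execution logs, the sketch below abstracts each log line into an event signature (numbers and hexadecimal identifiers masked), builds the relative frequency distribution of events for the load test and the field, and reports events whose frequencies differ noticeably. The abstraction and the threshold are assumptions for this sketch rather than the paper's exact approach.

```python
import re
from collections import Counter

def event_signature(line: str) -> str:
    """Abstract a raw log line to an event template by masking numbers and hex ids."""
    line = re.sub(r"0x[0-9a-fA-F]+", "<ID>", line)
    return re.sub(r"\d+", "<NUM>", line).strip()

def event_distribution(log_lines):
    """Relative frequency of each event template in a set of log lines."""
    counts = Counter(event_signature(l) for l in log_lines)
    total = sum(counts.values())
    return {event: n / total for event, n in counts.items()}

def workload_differences(test_lines, field_lines, threshold=0.05):
    """Report events whose relative frequency differs by more than `threshold`
    between the load test logs and the field logs."""
    test_dist, field_dist = event_distribution(test_lines), event_distribution(field_lines)
    events = set(test_dist) | set(field_dist)
    return {e: (test_dist.get(e, 0.0), field_dist.get(e, 0.0))
            for e in events
            if abs(test_dist.get(e, 0.0) - field_dist.get(e, 0.0)) > threshold}
```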


International Conference on Software Maintenance | 2013

Leveraging Performance Counters and Execution Logs to Diagnose Memory-Related Performance Issues

Mark D. Syer; Zhen Ming Jiang; Meiyappan Nagappan; Ahmed E. Hassan; Mohamed N. Nasser; Parminder Flora

Load tests ensure that software systems are able to perform under the expected workloads. The current state of load test analysis requires significant manual review of performance counters and execution logs, and a high degree of system-specific expertise. In particular, memory-related issues (e.g., memory leaks or spikes), which may degrade performance and cause crashes, are difficult to diagnose. Performance analysts must correlate hundreds of megabytes or gigabytes of performance counters (to understand resource usage) with execution logs (to understand system behaviour). However, little work has been done to combine these two types of information to assist performance analysts in their diagnosis. We propose an automated approach that combines performance counters and execution logs to diagnose memory-related issues in load tests. We perform three case studies on two systems: one open-source system and one large-scale enterprise system. Our approach flags ≤ 0.1% of the execution logs with a precision ≥ 80%.
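
A highly simplified sketch of combining the two data sources is shown below: it flags time windows in which a memory counter grows abnormally and then reports the log events that occur most frequently inside those windows. The window size, growth threshold and data shapes are assumptions for illustration, not the paper's actual technique.

```python
from collections import Counter

def flag_memory_growth(counter_samples, window=10, growth_threshold=0.10):
    """counter_samples: list of (timestamp, memory_bytes) tuples, evenly sampled.
    Flags windows whose memory usage grows by more than `growth_threshold`."""
    flagged = []
    for i in range(0, len(counter_samples) - window, window):
        start, end = counter_samples[i], counter_samples[i + window]
        if start[1] > 0 and (end[1] - start[1]) / start[1] > growth_threshold:
            flagged.append((start[0], end[0]))
    return flagged

def suspicious_log_events(logs, flagged_windows, top_n=5):
    """logs: list of (timestamp, event_template) tuples.
    Returns the most frequent events inside flagged memory-growth windows,
    as candidates for the analyst to inspect."""
    in_window = [event for ts, event in logs
                 if any(lo <= ts <= hi for lo, hi in flagged_windows)]
    return Counter(in_window).most_common(top_n)
```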


International Conference on Software Maintenance | 2011

Identifying performance deviations in thread pools

Mark D. Syer; Bram Adams; Ahmed E. Hassan

Large-scale software systems handle increasingly larger workloads by implementing highly concurrent and distributed design patterns. The thread pool pattern uses pools of pre-existing and reusable threads to limit thread lifecycle overhead (thread creation and destruction) and resource thrashing (thread proliferation). However, these advantages are weighed against performance issues caused by concurrency risks, like synchronization errors or deadlock, and thread pool-specific risks, like poorly tuned pool size or thread leakage. Detecting these performance issues during load testing requires a thorough understanding of how thread pools behave, yet most performance analysts have limited knowledge of the system and are flooded with terabytes of data from load tests. We propose a methodology to identify threads with performance deviations in thread pools. Our methodology ranks threads based on the dissimilarity of their resource usage metrics. A case study on a large-scale industrial software system shows that our methodology can identify threads with performance deviations with an average precision of 100% and an average recall of 76.61%. Our methodology performs very well when ranking long-lived deviations, such as memory leaks, but more work is needed to rank short-lived deviations, such as CPU spikes.
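
The sketch below illustrates the general idea of ranking threads by how much their resource usage deviates from the rest of the pool, using Euclidean distance from the per-metric median as a simple stand-in for the paper's dissimilarity measure.

```python
import math

def rank_threads_by_deviation(thread_metrics):
    """thread_metrics maps thread id -> vector of resource usage metrics
    (e.g., CPU time, memory). Ranks threads by Euclidean distance from the
    per-metric median of the pool, most deviating first."""
    ids = list(thread_metrics)
    vectors = [thread_metrics[t] for t in ids]
    dims = len(vectors[0])
    median = [sorted(v[d] for v in vectors)[len(vectors) // 2] for d in range(dims)]
    def distance(v):
        return math.sqrt(sum((v[d] - median[d]) ** 2 for d in range(dims)))
    return sorted(ids, key=lambda t: distance(thread_metrics[t]), reverse=True)

# Example: the thread with a much larger memory footprint ranks first.
print(rank_threads_by_deviation({
    "worker-1": [10.0, 200.0],
    "worker-2": [11.0, 210.0],
    "worker-3": [10.5, 950.0],  # possible memory leak
}))
```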


IEEE Transactions on Software Engineering | 2015

Replicating and Re-Evaluating the Theory of Relative Defect-Proneness

Mark D. Syer; Meiyappan Nagappan; Bram Adams; Ahmed E. Hassan

A good understanding of the factors impacting defects in software systems is essential for software practitioners, because it helps them prioritize quality improvement efforts (e.g., testing and code reviews). Defect prediction models are typically built using classification or regression analysis on product and/or process metrics collected at a single point in time (e.g., a release date). However, current defect prediction models only predict if a defect will occur, but not when, which makes the prioritization of software quality improvement efforts difficult. To address this problem, Koru et al. applied survival analysis techniques to a large number of software systems to study how size (i.e., lines of code) influences the probability that a source code module (e.g., class or file) will experience a defect at any given time. Given that 1) the work of Koru et al. has been instrumental to our understanding of the size-defect relationship, 2) the use of survival analysis in the context of defect modelling has not been well studied and 3) replication studies are an important component of balanced scholarly debate, we present a replication study of the work by Koru et al. In particular, we present the details necessary to use survival analysis in the context of defect modelling (such details were missing from the original paper by Koru et al.). We also explore how differences between the traditional domains of survival analysis (i.e., medicine and epidemiology) and defect modelling impact our understanding of the size-defect relationship. Practitioners and researchers considering the use of survival analysis should be aware of the implications of our findings.
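
For readers unfamiliar with survival analysis in this setting, the sketch below fits a Cox proportional hazards model relating module size to the hazard of experiencing a defect, using the `lifelines` Python package on illustrative toy data (the column names and values are assumptions, not data from the study).

```python
import pandas as pd
from lifelines import CoxPHFitter  # pip install lifelines

# Toy module-level data: time under observation until a defect or censoring,
# whether a defect was observed, and module size in lines of code.
modules = pd.DataFrame({
    "duration": [120, 340, 90, 400, 60, 250, 180, 310],   # days under observation
    "defect":   [1,   0,   1,  0,   1,  0,   1,   0],     # 1 = defect occurred, 0 = censored
    "loc":      [850, 120, 640, 90, 1200, 300, 700, 150], # size in lines of code
})

cph = CoxPHFitter()
cph.fit(modules, duration_col="duration", event_col="defect")
cph.print_summary()  # the coefficient on "loc" describes how size relates to the defect hazard
```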


Empirical Software Engineering | 2017

A study of the relation of mobile device attributes with the user-perceived quality of Android apps

Ehsan Noei; Mark D. Syer; Ying Zou; Ahmed E. Hassan; Iman Keivanloo

The number of mobile applications (apps) and mobile devices has increased considerably over the past few years. Online app markets, such as the Google Play Store, use a star-rating mechanism to quantify the user-perceived quality of mobile apps. Users may rate apps on a five-point (star) scale, where a five-star rating is the highest rating. Given the importance of a high star-rating to the success of an app, recent studies continue to explore the relationship between app attributes, such as User Interface (UI) complexity, and the user-perceived quality. However, the user-perceived quality reflects the users’ experience using an app on a particular mobile device. Hence, the user-perceived quality of an app is not solely determined by app attributes. In this paper, we study the relation of both device attributes and app attributes with the user-perceived quality of Android apps from the Google Play Store. We study 20 device attributes, such as the CPU and the display size, and 13 app attributes, such as code size and UI complexity. Our study is based on data from 30 types of Android mobile devices and 280 Android apps. We use linear mixed effect models to identify the device attributes and app attributes with the strongest relationship with the user-perceived quality. We find that the code size has the strongest relationship with the user-perceived quality. However, some device attributes, such as the CPU, have stronger relationships with the user-perceived quality than some app attributes, such as the number of UI inputs and outputs of an app. Our work helps both device manufacturers and app developers. Manufacturers can focus on the attributes that have significant relationships with the user-perceived quality. Moreover, app developers should be careful about the devices for which they make their apps available because the device attributes have a strong relationship with the ratings that users give to apps.
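
The sketch below shows how a linear mixed effects model of this kind can be fit with `statsmodels`, using a random intercept per app and fixed effects for one device attribute and one app attribute. The variable names and toy data are assumptions for illustration; the study's actual attributes and modelling details differ.

```python
import pandas as pd
import statsmodels.formula.api as smf  # pip install statsmodels

# Toy data: star ratings of (app, device) pairs, with a device attribute
# (CPU clock) and an app attribute (code size).
df = pd.DataFrame({
    "app":          ["a", "a", "a", "b", "b", "b", "c", "c", "c"],
    "rating":       [4.1, 3.6, 3.9, 4.5, 4.2, 4.4, 3.0, 2.8, 3.2],
    "cpu_ghz":      [2.3, 1.4, 1.8, 2.3, 1.8, 2.0, 1.4, 1.2, 2.1],
    "code_size_kb": [5200, 5200, 5200, 800, 800, 800, 12000, 12000, 12000],
})

# Random intercept per app captures app-to-app variation; the fixed effects
# estimate how the device and app attributes relate to the rating.
model = smf.mixedlm("rating ~ cpu_ghz + code_size_kb", data=df, groups=df["app"])
result = model.fit()
print(result.summary())
```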


Empirical Software Engineering | 2018

Examining the stability of logging statements

Suhas Kabinna; Cor-Paul Bezemer; Weiyi Shang; Mark D. Syer; Ahmed E. Hassan

Logging statements (embedded in the source code) produce logs that assist in understanding system behavior, monitoring choke-points and debugging. Prior work showcases the importance of logging statements in operating, understanding and improving software systems. The wide dependence on logs has led to a new market of log processing and management tools. However, logs are often unstable, i.e., the logging statements that generate logs are often changed without the consideration of other stakeholders, causing sudden failures of log processing tools and increasing the maintenance costs of such tools. We examine the stability of logging statements in four open source applications, namely Liferay, ActiveMQ, Camel and CloudStack. We find that 20–45% of their logging statements change throughout their lifetime. The median number of days between the introduction of a logging statement and the first change to that statement is between 1 and 17 in our studied applications. These numbers show that, in order to reduce maintenance effort, developers of log processing tools must be careful when selecting the logging statements on which their tools depend. In order to effectively mitigate the issues that are caused by unstable logging statements, we make an important first step towards determining whether a logging statement is likely to remain unchanged in the future. First, we use a random forest classifier to determine whether a just-introduced logging statement will change in the future, based solely on metrics that are calculated when it is introduced. Second, we examine whether a long-lived logging statement is likely to change based on its change history. We leverage Cox proportional hazards models (Cox models) to determine the change risk of long-lived logging statements in the source code. Through our case study on four open source applications, we show that our random forest classifier achieves an 83–91% precision, a 65–85% recall and a 0.95–0.96 AUC. We find that file ownership, developer experience, log density and SLOC are important metrics in our studied projects for determining the stability of logging statements in both our random forest classifiers and Cox models. Developers can use our approach to determine the risk of a logging statement changing in their own projects and to construct more robust log processing tools, by ensuring that these tools depend on logs that are generated by more stable logging statements.
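
As an illustration of the classification step, the sketch below trains a random forest on metrics available when a logging statement is introduced and predicts whether it will later change. The feature values are toy data; the metric names follow those highlighted in the abstract, but the paper's full feature set and evaluation setup are not reproduced here.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score

# Toy feature matrix: [file_ownership, developer_experience, log_density, sloc]
X = [
    [0.90, 120, 0.02, 400],
    [0.30,  10, 0.10, 2500],
    [0.80,  80, 0.03, 600],
    [0.20,   5, 0.12, 3000],
    [0.70,  60, 0.04, 900],
    [0.10,   2, 0.15, 4000],
    [0.95, 200, 0.01, 300],
    [0.25,   8, 0.11, 2800],
]
y = [0, 1, 0, 1, 0, 1, 0, 1]  # 1 = logging statement later changed

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
pred = clf.predict(X_test)
print(precision_score(y_test, pred, zero_division=0),
      recall_score(y_test, pred, zero_division=0))
```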


International Conference on Software Engineering | 2017

Analytics-driven load testing: an industrial experience report on load testing of large-scale systems

Tse-Hsun Chen; Mark D. Syer; Weiyi Shang; Zhen Ming Jiang; Ahmed E. Hassan; Mohamed N. Nasser; Parminder Flora

Assessing how large-scale software systems behave under load is essential because many problems cannot be uncovered without executing tests with large volumes of concurrent requests. Load-related problems can directly affect the customer-perceived quality of systems and often cost companies millions of dollars. Load testing is the standard approach for assessing how a system behaves under load. However, designing, executing and analyzing a load test can be very difficult due to the scale of the test (e.g., simulating millions of users and analyzing terabytes of data). Over the past decade, we have tackled many load testing challenges in an industrial setting. In this paper, we document the challenges that we encountered and the lessons that we learned as we addressed these challenges. We provide general guidelines for conducting load tests using an analytics-driven approach. We also discuss open research challenges that require attention from the research community. We believe that our experience can be beneficial to practitioners and researchers who are interested in the area of load testing.


Conference of the Centre for Advanced Studies on Collaborative Research | 2013

Revisiting prior empirical findings for mobile apps: an empirical case study on the 15 most popular open-source Android apps

Mark D. Syer; Meiyappan Nagappan; Ahmed E. Hassan; Bram Adams

Collaboration


Dive into Mark D. Syer's collaboration.

Top Co-Authors

Bram Adams

École Polytechnique de Montréal

Meiyappan Nagappan

Rochester Institute of Technology
