Publication


Featured research published by K. El Emam.


international conference on software engineering | 1999

An assessment and comparison of common software cost estimation modeling techniques

Lionel C. Briand; K. El Emam; D. Surmann; Isabella Wieczorek; K.D. Maxwell

This paper investigates two essential questions related to data-driven software cost modeling: (1) What modeling techniques are likely to yield more accurate results when using typical software development cost data? and (2) What are the benefits and drawbacks of using organization-specific data as compared to multi-organization databases? The former question is important in guiding software cost analysts in their choice of modeling technique. To address this issue, we assess and compare a selection of common cost modeling techniques that fulfill a number of important criteria, using a large multi-organization database in the business application domain. Namely, these are: ordinary least squares regression, stepwise ANOVA, CART, and analogy. The latter question is important in order to assess the feasibility of using multi-organization cost databases to build cost models and the benefits gained from local, company-specific data collection and modeling. As a large subset of the data in the multi-organization database came from one organization, we were able to investigate this issue by comparing organization-specific models with models based on multi-organization data. Results show that the performance of the modeling techniques considered was not significantly different, with the exception of the analogy-based models, which appear to be less accurate. Surprisingly, when using standard cost factors (e.g., COCOMO-like factors, Function Points), organization-specific models did not yield better results than generic, multi-organization models.
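
As a rough illustration of what such a comparison involves, here is a minimal sketch that fits three of the four techniques named above to invented project data and scores them by mean magnitude of relative error (MMRE). It is not the paper's pipeline or data: all sizes, efforts, and parameters are made up, stepwise ANOVA is omitted for brevity, and scikit-learn is assumed available for the tree and nearest-neighbor models.

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor        # CART-style regression tree
    from sklearn.neighbors import KNeighborsRegressor     # stand-in for analogy

    rng = np.random.default_rng(42)
    n = 120
    size = rng.uniform(50, 2000, n)                       # hypothetical size (Function Points)
    effort = 30 * size**0.9 * rng.lognormal(0, 0.3, n)    # hypothetical effort (person-hours)

    X, y = size.reshape(-1, 1), effort
    train, test = slice(0, 90), slice(90, None)

    def mmre(actual, predicted):
        """Mean magnitude of relative error, a common cost-model accuracy measure."""
        return np.mean(np.abs(actual - predicted) / actual)

    # OLS regression on log-transformed data (a common log-linear specification)
    coef = np.polyfit(np.log(X[train, 0]), np.log(y[train]), 1)
    ols_pred = np.exp(np.polyval(coef, np.log(X[test, 0])))

    cart_pred = DecisionTreeRegressor(min_samples_leaf=10).fit(X[train], y[train]).predict(X[test])
    analogy_pred = KNeighborsRegressor(n_neighbors=3).fit(X[train], y[train]).predict(X[test])

    for name, pred in [("OLS (log-linear)", ols_pred), ("CART", cart_pred), ("analogy (3-NN)", analogy_pred)]:
        print(f"{name}: MMRE = {mmre(y[test], pred):.2f}")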


international conference on software engineering | 1998

COBRA: a hybrid method for software cost estimation, benchmarking, and risk assessment

Lionel C. Briand; K. El Emam; Frank Bomarius

Current cost estimation techniques have a number of drawbacks. For example, developing algorithmic models requires extensive past project data. Also, off-the-shelf models have been found to be difficult to calibrate, yet inaccurate without calibration. Informal approaches based on experienced estimators depend on those estimators' availability, are not easily repeatable, and are not much more accurate than algorithmic techniques. We present a method for cost estimation that combines aspects of the algorithmic and experiential approaches, referred to as COBRA (COst estimation, Benchmarking, and Risk Assessment). Through a case study we find that cost estimates using COBRA show an average absolute relative error (ARE) of 0.09. Although we do not have room to describe the benchmarking and risk assessment parts, the reader will find detailed information in (Briand et al., 1997).
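
The hybrid idea can be sketched schematically: expert-elicited cost-overhead distributions are layered on top of a data-derived nominal productivity, and simulation yields an effort distribution rather than a point estimate. Everything below (the drivers, distributions, productivity, and sizes) is invented for illustration; the actual COBRA method elicits causal models from experts, so this is only an approximation of its spirit.

    import numpy as np

    rng = np.random.default_rng(0)

    nominal_productivity = 0.8     # hypothetical: hours per LOC, from past project data
    size = 10_000                  # hypothetical project size (LOC)

    # Expert-elicited cost overhead (%) per driver, modeled as triangular
    # distributions (min, most likely, max) -- values invented for illustration.
    drivers = {
        "requirements volatility": (5, 15, 30),
        "team inexperience": (0, 10, 25),
    }

    n_sim = 10_000
    overhead = sum(rng.triangular(lo, mode, hi, n_sim) for lo, mode, hi in drivers.values())
    effort = size * nominal_productivity * (1 + overhead / 100)

    print(f"median estimate: {np.median(effort):,.0f} hours")
    print(f"75th percentile (for risk assessment): {np.percentile(effort, 75):,.0f} hours")

    # Absolute relative error (ARE), the accuracy measure reported in the paper.
    actual = 11_500.0              # hypothetical actual effort
    print(f"ARE = {abs(actual - np.median(effort)) / actual:.2f}")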


international symposium on software reliability engineering | 1998

The repeatability of code defect classifications

K. El Emam; I. Wieczorek

Counts of defects found during the various defect detection activities in software projects, together with their classification, provide a basis for product quality evaluation and process improvement. However, since defect classifications are subjective, it is necessary to ensure that they are repeatable (i.e., that the classification does not depend on the individual). We evaluate a slight adaptation of a commonly used defect classification scheme that has been applied in IBM's Orthogonal Defect Classification work and in the SEI's Personal Software Process. The evaluation utilizes the Kappa statistic. We use defect data from code inspections conducted during a development project. Our results indicate that the classification scheme is in general repeatable. We further evaluate classes of defects to find out whether confusion between some categories is more common, and suggest a potential improvement to the scheme.
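
For readers unfamiliar with it, the Kappa statistic measures agreement between two raters, corrected for the agreement expected by chance alone. A minimal sketch, with invented defect labels (the category names echo ODC-style defect types but are not the paper's data):

    from collections import Counter

    def cohen_kappa(rater_a, rater_b):
        """Agreement between two raters, corrected for chance agreement."""
        n = len(rater_a)
        observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
        freq_a, freq_b = Counter(rater_a), Counter(rater_b)
        expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
        return (observed - expected) / (1 - expected)

    a = ["function", "assignment", "interface", "checking", "function", "assignment"]
    b = ["function", "assignment", "interface", "function", "function", "assignment"]
    print(f"kappa = {cohen_kappa(a, b):.2f}")  # values above 0.8 are often read as excellent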


international symposium on software reliability engineering | 1997

Quantitative evaluation of capture-recapture models to control software inspections

Lionel C. Briand; K. El Emam; B. Freimut; Oliver Laitenberger

An important requirement for controlling the inspection of software artifacts is the ability to decide, based on objective information, whether inspection can stop or should continue in order to achieve a suitable level of artifact quality. Several studies in software engineering have considered the use of capture-recapture models to predict the number of defects remaining in an inspected document as a decision criterion for reinspection. However, no study on software engineering artifacts compares the actual number of remaining defects to the number predicted by a capture-recapture model. Simulations have been performed, but no definite conclusions can be drawn regarding the degree of accuracy of such models under realistic inspection conditions, or the factors affecting this accuracy. Furthermore, none of these studies performed an exhaustive comparison of existing models. In this study, we focus on traditional inspections and estimate, based on actual inspection data, the degree of accuracy of all relevant, state-of-the-art capture-recapture models for which statistical estimators exist. We compare the various models' accuracies and examine the impact of the number of inspectors on them. Results show that model accuracies are strongly affected by the number of inspectors; therefore, one must consider this factor before using capture-recapture models. When the number of inspectors is below 4, no model is sufficiently accurate and underestimation may be substantial. In addition, some models perform better than others under a large number of conditions, and plausible reasons for this are discussed. Based on our analyses, we recommend using a model that allows for different probabilities of detecting defects, with a Jackknife estimator.
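
The basic capture-recapture logic can be illustrated with the simplest two-inspector (Lincoln-Petersen) estimator. This is not the model the paper ultimately recommends, which allows different detection probabilities and uses a Jackknife estimator; the defect IDs below are invented.

    def lincoln_petersen(found_by_a, found_by_b):
        """Estimate total defects from two inspectors' defect sets:
        overlap between independent 'captures' indicates how much was missed."""
        overlap = len(found_by_a & found_by_b)
        if overlap == 0:
            raise ValueError("no common defects; estimator undefined")
        return len(found_by_a) * len(found_by_b) / overlap

    inspector_a = {1, 2, 3, 5, 8, 9}     # hypothetical defect IDs found by inspector A
    inspector_b = {2, 3, 4, 5, 9, 11}    # hypothetical defect IDs found by inspector B

    total_est = lincoln_petersen(inspector_a, inspector_b)
    found = len(inspector_a | inspector_b)
    print(f"estimated total defects: {total_est:.1f}")
    print(f"estimated remaining: {total_est - found:.1f}")  # reinspect if this is high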


international conference on software engineering | 1999

Explaining the cost of European space and military projects

Lionel C. Briand; K. El Emam; Isabella Wieczorek

There has been much controversy in the literature over several issues underlying the construction of parametric software development cost models. For example, it has been debated whether (dis)economies of scale exist in software production, what functional form should be assumed between effort and product size, whether COCOMO factors are useful, and whether the COCOMO factors are independent. Answers to such questions should help software organizations define suitable data collection programs and well-specified cost models. We use a data set collected by the European Space Agency to perform such an investigation. To ensure a certain degree of consistency in our data, we focus our analysis on a set of space and military projects that represent an important application domain and the largest subset in the database. These projects were, however, performed by a variety of organizations. First, our results indicate that two functional forms are plausible between effort and product size: linear and log-linear. This also means that different project subpopulations are likely to follow different functional forms. Second, besides product size, the strongest factor influencing cost appears to be team size: larger teams result in substantially lower productivity, which is interesting considering that this attribute is rarely collected in software engineering cost databases. Third, although some COCOMO factors appear to be useful and significant covariates, they play a minor role in explaining project effort.
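
A minimal sketch of what comparing the two functional forms involves, on invented size/effort data rather than the ESA database; in the log-linear form, a size exponent above 1 would indicate diseconomies of scale.

    import numpy as np

    rng = np.random.default_rng(1)
    size = rng.uniform(10, 500, 60)                         # hypothetical KLOC
    effort = 5.0 * size**1.05 * rng.lognormal(0, 0.2, 60)   # hypothetical person-months

    # Linear form: effort = a + b * size
    lin_coef = np.polyfit(size, effort, 1)
    lin_resid = effort - np.polyval(lin_coef, size)

    # Log-linear form: log(effort) = a + b * log(size)
    log_coef = np.polyfit(np.log(size), np.log(effort), 1)
    log_resid = np.log(effort) - np.polyval(log_coef, np.log(size))

    print(f"log-linear size exponent b = {log_coef[0]:.2f}")
    print(f"linear R^2     = {1 - lin_resid.var() / effort.var():.3f}")
    print(f"log-linear R^2 = {1 - log_resid.var() / np.log(effort).var():.3f}")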


ieee international software metrics symposium | 1998

The internal consistency of the ISO/IEC 15504 software process capability scale

K. El Emam

ISO/IEC 15504 is an emerging international standard for software process assessment. It has undergone a major change in the rating scale used to measure the capability of processes. The objective of this paper is to present a follow-up evaluation of the internal consistency of this process capability scale. Internal consistency is a form of reliability of a subjective measurement instrument. A previous study evaluated the internal consistency of the first version of the ISO/IEC 15504 document set (also known as SPICE version 1). In the current study we evaluate the internal consistency of the second version (also known as ISO/IEC PDTR 15504). Our results indicate that the internal consistency of the capability dimension did not deteriorate, and that it is still sufficiently high for practical purposes. Furthermore, we identify that the capability scale has two dimensions, which we termed Process Implementation and Quantitative Process Management.
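
Internal consistency of a multi-item rating scale is conventionally assessed with Cronbach's alpha; assuming that is the statistic intended here, a minimal sketch with invented capability ratings follows.

    import numpy as np

    def cronbach_alpha(ratings):
        """ratings: (n_process_instances, n_scale_items) matrix of ratings."""
        ratings = np.asarray(ratings, dtype=float)
        k = ratings.shape[1]
        item_vars = ratings.var(axis=0, ddof=1).sum()   # sum of per-item variances
        total_var = ratings.sum(axis=1).var(ddof=1)     # variance of total scores
        return (k / (k - 1)) * (1 - item_vars / total_var)

    # Rows: assessed process instances; columns: items of the capability scale.
    ratings = [[4, 3, 3, 2], [2, 2, 1, 1], [4, 4, 3, 3], [3, 3, 2, 2], [1, 1, 1, 0]]
    print(f"alpha = {cronbach_alpha(ratings):.2f}")  # >= 0.8 is usually deemed sufficient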


Proceedings of Software Process 1996 | 1996

Interrater agreement in SPICE-based assessments: some preliminary results

K. El Emam; Dennis R. Goldenson; Lionel C. Briand; P. Marshall

The international SPICE Project intends to deliver an ISO standard on software process assessment. This project is unique among software engineering standards efforts in that there is a set of empirical trials whose objectives are to evaluate the prospective standard and provide feedback before standardization. One of the enduring issues being evaluated during the trials is the reliability of assessments based on SPICE. One element of reliability is the extent to which different teams assessing the same processes produce similar ratings when presented with the same evidence. We present some preliminary results from two assessments conducted during the SPICE trials. In each of these assessments, two independent teams performed the same ratings. The results indicate that, in general, there is at least moderate agreement between the two teams in both cases. When we take into account the severity of disagreement, the extent of agreement between the two teams is almost perfect. Our results also indicate that interrater agreement is not the same for different SPICE processes. The findings reported in this paper provide guidance for future studies of interrater agreement in the SPICE trials and also indicate some potential issues that need to be considered within the prospective standard.
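
Taking the severity of disagreement into account typically means a weighted kappa, which penalizes near-miss ratings less than distant ones. A minimal sketch, assuming a four-point adequacy scale coded 0-3 and invented team ratings (not the trial data):

    import numpy as np

    def weighted_kappa(a, b, n_cats):
        """Weighted kappa with linear disagreement weights."""
        a, b = np.asarray(a), np.asarray(b)
        # 0 on the diagonal, growing with rating distance.
        w = np.abs(np.subtract.outer(np.arange(n_cats), np.arange(n_cats)))
        observed = np.zeros((n_cats, n_cats))
        for i, j in zip(a, b):
            observed[i, j] += 1
        observed /= len(a)
        expected = np.outer(observed.sum(1), observed.sum(0))
        return 1 - (w * observed).sum() / (w * expected).sum()

    team1 = [3, 2, 2, 1, 3, 0, 2, 3]   # hypothetical ratings by team 1
    team2 = [3, 2, 1, 1, 3, 1, 2, 2]   # hypothetical ratings by team 2
    print(f"weighted kappa = {weighted_kappa(team1, team2, 4):.2f}")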


Proceedings of Software Process 1996 | 1996

Implementing concepts from the Personal Software Process in an industrial setting

K. El Emam; B. Shostak; Nazim H. Madhavji

The Personal Software Process (PSP) has been taught at a number of universities with impressive results. It is also of interest to industry as a means of training software engineers. While there are published reports on the teaching of PSP in classroom settings (at universities and in industry), little systematic study has been conducted on the implementation of PSP in industry, and largely anecdotal evidence exists as to its effectiveness on real programming tasks. Effectiveness is measured in terms of the number of trained engineers who actually use PSP in their daily work, and improvements in productivity and defect removal. We report on a study of the implementation of some PSP concepts in a commercial organization. The empirical enquiry method that we employed was action research. Our results identify the problems that were encountered during the four major activities of a PSP implementation: planning, training, evaluation, and leveraging. We describe how these problems were addressed, and the general lessons learned from the implementation. An overall PSP training transfer rate of 46.5% was achieved. Among the engineers in our study, those who applied all of the taught PSP concepts on the job improved their defect detection capabilities.


ieee international software metrics symposium | 1998

Cost implications of interrater agreement for software process assessments

K. El Emam; J.-M. Simon; S. Rousseau; E. Jacquet

Much empirical research has been done on evaluating and modeling interrater agreement in software process assessments. Interrater agreement is the extent to which assessors agree in their ratings of software process capabilities when presented with the same evidence and performing their ratings independently. This line of research was based on the premise that lack of interrater agreement can lead to erroneous decisions from process assessment scores. However, thus far we do not know the impact of interrater agreement on the cost of assessments. We report on a study that evaluates the relationship between interrater agreement and the cost of the consolidation activity in assessments. The study was conducted in the context of two assessments using the emerging international standard ISO/IEC 15504. Our results indicate that for organizational processes, the relationship is strong and in the expected direction. For project level processes no relationship was found. These results indicate that for assessments that include organizational processes in their scope, ensuring high interrater agreement could lead to a reduction in their costs.
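
A minimal sketch of the kind of analysis involved: correlating per-process interrater agreement with consolidation effort using a rank correlation. All numbers are invented, and the paper's actual statistical procedure may differ.

    from scipy.stats import spearmanr

    kappa = [0.81, 0.42, 0.65, 0.30, 0.77, 0.55]        # hypothetical agreement per process
    consolidation_minutes = [12, 45, 25, 60, 15, 30]    # hypothetical time to reconcile ratings

    rho, p = spearmanr(kappa, consolidation_minutes)
    print(f"rho = {rho:.2f}, p = {p:.3f}")  # expect negative: more agreement, less cost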


ieee international software metrics symposium | 1998

The predictive validity criterion for evaluating binary classifiers

K. El Emam

The development of binary classifiers to identify highly error-prone or high-maintenance-cost components is increasingly common in the software engineering quality modeling literature and in practice. One approach to evaluating these classifiers is to determine their ability to predict the classes of unseen cases, i.e., their predictive validity. A chi-square statistical test has frequently been used to evaluate predictive validity. We illustrate that this test has a number of disadvantages, including the difficulty of using its results to determine whether a classifier is a good predictor, demonstrated through a number of examples, and a rather conservative Type I error rate, demonstrated through a Monte Carlo simulation. We present an alternative test that has been used in the social sciences for evaluating agreement with a gold standard. The use of this alternative test is illustrated in practice by developing a classification model to predict maintenance effort for an object-oriented system, and evaluating its predictive validity on data from a second object-oriented system in the same environment.
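
To make the contrast concrete, here is a minimal sketch applying both a chi-square test and a chance-corrected agreement statistic (Cohen's kappa, a plausible stand-in for the gold-standard agreement test the paper proposes, not necessarily its exact procedure) to the same invented 2x2 confusion matrix.

    import numpy as np
    from scipy.stats import chi2_contingency

    # Rows: predicted class (high/low risk); columns: actual class. Invented counts.
    confusion = np.array([[20, 15],
                          [10, 55]])

    chi2, p, _, _ = chi2_contingency(confusion)
    print(f"chi-square = {chi2:.2f}, p = {p:.4f}")  # only tests association, not quality

    n = confusion.sum()
    po = np.trace(confusion) / n                               # observed agreement
    pe = (confusion.sum(1) * confusion.sum(0)).sum() / n**2    # chance agreement
    print(f"kappa = {(po - pe) / (1 - pe):.2f}")  # interpretable as agreement beyond chance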

Collaboration


Dive into K. El Emam's collaborations.

Top Co-Authors

Nazim H. Madhavji

University of Western Ontario


Dennis R. Goldenson

Software Engineering Institute
