Edward B. Allen
Mississippi State University
Publications
Featured research published by Edward B. Allen.
IEEE Transactions on Neural Networks | 1997
Taghi M. Khoshgoftaar; Edward B. Allen; John P. Hudepohl; Stephen J. Aud
Society relies on telecommunications to such an extent that telecommunications software must have high reliability. Enhanced measurement for early risk assessment of latent defects (EMERALD) is a joint project of Nortel and Bell Canada for improving the reliability of telecommunications software products. This paper reports a case study of neural-network modeling techniques developed for the EMERALD system. The resulting neural network is currently in the prototype testing phase at Nortel. Neural-network models can be used to identify fault-prone modules for extra attention early in development, and thus reduce the risk of operational problems with those modules. We modeled a subset of modules representing over seven million lines of code from a very large telecommunications software system. The set consisted of those modules reused with changes from the previous release. The dependent variable was membership in the class of fault-prone modules. The independent variables were principal components of nine measures of software design attributes. We compared the neural-network model with a nonparametric discriminant model and found the neural-network model had better predictive accuracy.
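To make the approach concrete, the sketch below pairs a principal-components reduction of nine per-module design measures with a small neural-network classifier, roughly as the abstract describes. It is a minimal illustration on synthetic data using scikit-learn stand-ins, not a reproduction of the EMERALD model or the Nortel dataset.

```python
# Minimal sketch (synthetic data, scikit-learn stand-ins; not the EMERALD model):
# principal components of nine design measures feed a small neural-network
# classifier of fault-prone vs. not fault-prone modules.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 9))  # nine design-attribute measures per module
# hypothetical ground truth: 1 = fault-prone
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=1000) > 1.2).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=5),  # principal components as the network's inputs
    MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0),
)
model.fit(X_tr, y_tr)
print("holdout accuracy:", model.score(X_te, y_te))
```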
International Journal of Reliability, Quality and Safety Engineering | 1999
Taghi M. Khoshgoftaar; Edward B. Allen
Reliable software is mandatory for complex mission-critical systems. Classifying modules as fault-prone, or not, is a valuable technique for guiding development processes, so that resources can be focused on those parts of a system that are most likely to have faults. Logistic regression offers advantages over other classification modeling techniques, such as interpretable coefficients. There are few prior applications of logistic regression to software quality models in the literature, and none that we know of account for prior probabilities and costs of misclassification. A contribution of this paper is the application of prior probabilities and costs of misclassification to a logistic regression-based classification rule for a software quality model. This paper also contributes an integrated method for using logistic regression in software quality modeling, including examples of how to interpret coefficients, how to use prior probabilities, and how to use costs of misclassifications. A case study of a major subsystem of a military, real-time system illustrates the techniques.
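The core of such a rule can be illustrated with the standard Bayes minimum-expected-cost threshold on the logistic model's posterior odds; the paper's exact rule and notation may differ, and the data and cost values below are invented for illustration.

```python
# Sketch of a cost-sensitive classification rule for a logistic model:
# flag a module as fault-prone when its posterior odds exceed the ratio of
# the false-alarm cost to the missed-fault cost (standard Bayes form; the
# paper's exact rule may differ).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(800, 4))  # software measures (synthetic)
y = (X[:, 0] - X[:, 2] + rng.normal(scale=0.7, size=800) > 1.0).astype(int)

clf = LogisticRegression().fit(X, y)
p = np.clip(clf.predict_proba(X)[:, 1], 1e-9, 1 - 1e-9)  # P(fault-prone | x)

C_miss, C_fa = 10.0, 1.0  # illustrative: a miss costs 10x a false alarm
fault_prone = p / (1 - p) > C_fa / C_miss  # minimum-expected-cost rule
print("flagged", int(fault_prone.sum()), "of", len(y), "modules")
```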
IEEE Transactions on Reliability | 2002
Taghi M. Khoshgoftaar; Edward B. Allen; Jianyu Deng
Software faults are defects in software modules that might cause failures. Software developers tend to focus on faults, because they are closely related to the amount of rework necessary to prevent future operational software failures. The goal of this paper is to predict which modules are fault-prone and to do it early enough in the life cycle to be useful to developers. A regression tree is an algorithm represented by an abstract tree, where the response variable is a real quantity. Software modules are classified as fault-prone or not, by comparing the predicted value to a threshold. A classification rule is proposed that allows one to choose a preferred balance between the two types of misclassification rates. A case study of a very large telecommunications system considered software modules to be fault-prone if any faults were discovered by customers. Our research shows that classifying fault-prone modules with regression trees, using the classification rule in this paper, resulted in predictions with satisfactory accuracy and robustness.
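A minimal sketch of this idea, on synthetic data: fit a regression tree to fault counts, then sweep the classification threshold and observe how the two misclassification rates trade off. The threshold choices and data are illustrative, not from the case study.

```python
# Sketch (synthetic data): a regression tree predicts fault counts; modules
# are classified by comparing the prediction to a threshold, and sweeping the
# threshold trades Type I (false alarm) against Type II (missed) rates.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(2)
X = rng.normal(size=(1500, 6))
faults = np.maximum(0, np.round(2 * X[:, 0] + X[:, 1] + rng.normal(size=1500)))
y = (faults > 0).astype(int)  # fault-prone if any faults found by customers

tree = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X, faults)
pred = tree.predict(X)

for t in np.quantile(pred, [0.5, 0.7, 0.9]):  # candidate thresholds
    flag = pred > t
    type1 = flag[y == 0].mean()     # not fault-prone, but flagged
    type2 = (~flag)[y == 1].mean()  # fault-prone, but missed
    print(f"threshold {t:.2f}: Type I {type1:.2f}, Type II {type2:.2f}")
```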
Proceedings 3rd IEEE Symposium on Application-Specific Systems and Software Engineering Technology | 2000
Xiaohong Yuan; Taghi M. Khoshgoftaar; Edward B. Allen; K. Ganesan
The ever-increasing demand for high software reliability requires more robust modeling techniques for software quality prediction. The paper presents a modeling technique that integrates fuzzy subtractive clustering with module-order modeling for software quality prediction. First, fuzzy subtractive clustering is used to predict the number of faults; then module-order modeling is used to predict whether modules are fault-prone or not. Note that multiple linear regression is a special case of fuzzy subtractive clustering. We conducted a case study of a large legacy telecommunication system to predict whether each module will be considered fault-prone. The case study found that, using fuzzy subtractive clustering and module-order modeling, one can classify modules which will likely have faults discovered by customers with useful accuracy prior to release.
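The module-order step can be sketched as follows. Because the abstract notes that multiple linear regression is a special case of fuzzy subtractive clustering, plain linear regression stands in for the clustering step here; the 20% cutoff and data are invented for illustration.

```python
# Sketch of module-order modeling: rank modules by predicted faults and flag
# a chosen top fraction as fault-prone (linear regression stands in for the
# fuzzy subtractive clustering step).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(1000, 5))
faults = np.maximum(0, 3 * X[:, 0] + rng.normal(size=1000))

reg = LinearRegression().fit(X, faults)
order = np.argsort(-reg.predict(X))  # most- to least-fault-prone
fault_prone = np.zeros(len(order), dtype=bool)
fault_prone[order[: int(0.20 * len(order))]] = True  # flag the top 20%
print("flagged", int(fault_prone.sum()), "modules for extra scrutiny")
```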
IEEE International Software Metrics Symposium | 2001
Edward B. Allen; Taghi M. Khoshgoftaar; Ye Chen
Coupling of a subsystem characterizes its interdependence with other subsystems. A subsystem's cohesion, on the other hand, characterizes its internal interdependencies. When used in conjunction with other attributes, measurements of a subsystem's coupling and cohesion can contribute to software quality models. An abstraction of a software system can be represented by a graph, and a module (subsystem) by a subgraph. Software design graphs depict components and their relationships. E.B. Allen and T.M. Khoshgoftaar (1999) proposed information theory-based measures of coupling and cohesion of a modular system. This paper proposes related information theory-based measures of coupling and cohesion of a module. These measures have the properties of module-level coupling and cohesion defined by Briand, Morasca and Basili (1997, 1999). We define cohesion of a module in terms of intra-module coupling, normalized to between zero and one. We illustrate the measures with example graphs and an empirical analysis of the call graph of a moderate-sized C program, the Nethack computer game. Preliminary analysis showed that the information-theory approach has finer discrimination than counting.
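As a toy illustration of the flavor of these measures (the papers' actual definitions are more involved and are not reproduced here), one can assign each node the information content of its adjacency-row pattern and normalize a module's intra-module information by that of the complete subgraph, a hypothetical normalizer chosen for this sketch:

```python
# Toy pattern-entropy measure (a simplification, not the papers' definitions):
# each node contributes -log2 of the frequency of its adjacency-row pattern;
# module cohesion is intra-module information normalized to [0, 1] by the
# information of the complete subgraph.
import math
from collections import Counter

def graph_information(adj):
    """adj: list of adjacency rows (tuples of 0/1). Returns bits."""
    n = len(adj)
    counts = Counter(adj)
    return sum(-math.log2(counts[row] / n) for row in adj)

star = [  # a hub connected to four leaves
    (0, 1, 1, 1, 1),
    (1, 0, 0, 0, 0),
    (1, 0, 0, 0, 0),
    (1, 0, 0, 0, 0),
    (1, 0, 0, 0, 0),
]
complete = [tuple(0 if i == j else 1 for j in range(5)) for i in range(5)]
cohesion = graph_information(star) / graph_information(complete)
print(f"cohesion ~ {cohesion:.2f}")  # ~0.31; the complete graph scores 1.0
```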
IEEE International Software Metrics Symposium | 1999
Edward B. Allen; Taghi M. Khoshgoftaar
The design of software is often depicted by graphs that show components and their relationships. For example, a structure chart shows the calling relationships among components. Object-oriented design is based on various graphs as well. Such graphs are abstractions of the software, devised to depict certain design decisions. Coupling and cohesion are attributes that summarize the degree of interdependence or connectivity among subsystems and within subsystems, respectively. When used in conjunction with measures of other attributes, coupling and cohesion can contribute to an assessment or prediction of software quality. Let a graph be an abstraction of a software system and let a subgraph represent a module (subsystem). The paper proposes information theory-based measures of coupling and cohesion of a modular system. These measures have the properties of system-level coupling and cohesion defined by L.C. Briand et al. (1996; 1997). Coupling is based on relationships between modules. We also propose a similar measure of intramodule coupling, based on an intramodule abstraction of the software rather than an intermodule one; intramodule coupling is calculated in the same way as intermodule coupling. We define cohesion in terms of intramodule coupling, normalized to between zero and one. We illustrate the measures with example graphs. Preliminary analysis showed that the information theory approach has finer discrimination than counting.
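The closing claim, that the information-theory approach discriminates more finely than counting, can be illustrated with the same toy pattern-entropy measure: a star and a path on five nodes have identical edge counts, yet different information content. Again, this simplified measure only gestures at the paper's definitions.

```python
# A star and a path on five nodes both have four edges, so edge counting
# cannot separate them; the toy pattern-entropy measure can (simplified,
# not the paper's exact measure).
import math
from collections import Counter

def graph_information(adj):
    n = len(adj)
    counts = Counter(adj)
    return sum(-math.log2(counts[row] / n) for row in adj)

star = [(0, 1, 1, 1, 1), (1, 0, 0, 0, 0), (1, 0, 0, 0, 0),
        (1, 0, 0, 0, 0), (1, 0, 0, 0, 0)]
path = [(0, 1, 0, 0, 0), (1, 0, 1, 0, 0), (0, 1, 0, 1, 0),
        (0, 0, 1, 0, 1), (0, 0, 0, 1, 0)]

for name, g in [("star", star), ("path", path)]:
    edges = sum(map(sum, g)) // 2
    print(f"{name}: {edges} edges, {graph_information(g):.2f} bits")
# star: 4 edges, 3.61 bits; path: 4 edges, 11.61 bits
```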
Empirical Software Engineering | 1998
Taghi M. Khoshgoftaar; Edward B. Allen
Software quality models can give timely predictions of reliability indicators, for targeting software improvement efforts. In some cases, classification techniques are sufficient for useful software quality models. The software engineering community has not applied informed prior probabilities widely to software quality classification modeling studies. Moreover, even though costs are of paramount concern to software managers, costs of misclassification have received little attention in the software engineering literature. This paper applies informed prior probabilities and costs of misclassification to software quality classification. We also discuss the advantages and limitations of several statistical methods for evaluating the accuracy of software quality classification models. We conducted two full-scale industrial case studies which integrated these concepts with nonparametric discriminant analysis to illustrate how they can be used by a classification technique. The case studies supported our hypothesis that classification models of software quality can benefit by considering informed prior probabilities and by minimizing the expected cost of misclassifications. The case studies also illustrated the advantages and limitations of resubstitution, cross-validation, and data splitting for model evaluation.
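The expected-cost criterion the case studies minimize can be written down compactly; the sketch below uses the standard two-class form with illustrative numbers, not values from the studies.

```python
# The two-class expected cost of misclassification (ECM) with informed
# priors; numbers are illustrative, not from the case studies.
def expected_cost(p_type1, p_type2, pi_nfp, pi_fp, c_type1, c_type2):
    """ECM = C_I * Pr(Type I) * pi_nfp + C_II * Pr(Type II) * pi_fp."""
    return c_type1 * p_type1 * pi_nfp + c_type2 * p_type2 * pi_fp

# 25% false alarms, 10% misses, 80% of modules not fault-prone, and a
# missed fault-prone module assumed 10x as costly as a false alarm:
print(expected_cost(0.25, 0.10, 0.80, 0.20, 1.0, 10.0))  # -> 0.4
```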
IEEE Computer | 1998
Taghi M. Khoshgoftaar; Edward B. Allen; Robert Halstead; Gary P. Trio; Ronald M. Flass
Many software quality models use only software product metrics to predict module reliability. For evolving systems, however, software process measures are also important. In this case study, the authors use module history data to predict module reliability in a subsystem of JStars, a real-time military system.
Proceedings 3rd IEEE Symposium on Application-Specific Systems and Software Engineering Technology | 2000
Taghi M. Khoshgoftaar; Edward B. Allen; Zhiwei Xu
J.M. Voas (1992) defines testability as the probability that a test case will fail if the program has a fault. It is defined in the context of an oracle for the test and a distribution of test cases, usually emulating operations. Because testability is a dynamic attribute of software, it is very computation-intensive to measure directly. The paper presents a case study of real-time avionics software to predict the testability of each module from static measurements of source code. The static software metrics take much less computation than direct measurement of testability. Thus, a model based on inexpensive measurements could be an economical way to take advantage of testability attributes during software development. We found that neural networks are a promising technique for building such predictive models, because they are able to model nonlinearities in relationships. Our goal is to predict a quantity between zero and one whose distribution is highly skewed toward zero. This is very difficult for standard statistical techniques. In other words, high testability modules present a challenging prediction problem that is appropriate for neural networks.
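A hedged sketch of the prediction setup: a small multilayer perceptron regressing a bounded, zero-skewed testability score from static metrics. The data, network shape, and the final clipping step are our assumptions, not the paper's model.

```python
# Sketch (synthetic data, not the paper's model): an MLP regresses a bounded,
# zero-skewed testability score from static code metrics; predictions are
# clipped to [0, 1] since sklearn's regressor output is unbounded.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)
X = rng.normal(size=(1200, 8))  # static source-code metrics
raw = X[:, 0] - 1.5 + 0.3 * rng.normal(size=1200)
y = 1 / (1 + np.exp(-3 * raw))  # testability in [0, 1], skewed toward zero

model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(10,), max_iter=3000, random_state=0),
)
model.fit(X, y)
pred = np.clip(model.predict(X), 0.0, 1.0)
print("share of modules with predicted testability < 0.1:", (pred < 0.1).mean())
```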
Empirical Software Engineering | 2000
Taghi M. Khoshgoftaar; Xiaojing Yuan; Edward B. Allen
Software product and process metrics can be useful predictors of which modules are likely to have faults during operations. Developers and managers can use such predictions by software quality models to focus enhancement efforts before release. However, in practice, software quality modeling methods in the literature may not produce a useful balance between the two kinds of misclassification rates, especially when there are few faulty modules. This paper presents a practical classification rule in the context of classification tree models that allows appropriate emphasis on each type of misclassification according to the needs of the project. This is especially important when the faulty modules are rare. An industrial case study using classification trees illustrates the tradeoffs. The trees were built using the TREEDISC algorithm, which is a refinement of the CHAID algorithm. We examined two releases of a very large telecommunications system, and built models suited to two points in the development life cycle: the end of coding and the end of beta testing. Both trees had only five significant predictors, out of 28 and 42 candidates, respectively. We interpreted the structure of the classification trees, and we found the models had useful accuracy.
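TREEDISC is not available in common open-source libraries, so the sketch below substitutes a CART tree with class weights to show the same lever: shifting emphasis between the two misclassification types when fault-prone modules are rare. Data and weights are invented.

```python
# Sketch: a class-weighted CART tree stands in for TREEDISC/CHAID to show
# how weighting the rare fault-prone class trades Type I (false alarm)
# against Type II (missed) rates.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(5)
X = rng.normal(size=(2000, 10))
y = (X[:, 0] + X[:, 3] + rng.normal(size=2000) > 2.5).astype(int)  # rare class

for w in (1, 5, 20):  # heavier weight: fewer misses, more false alarms
    tree = DecisionTreeClassifier(max_depth=4, class_weight={0: 1, 1: w},
                                  random_state=0).fit(X, y)
    pred = tree.predict(X)
    type1 = (pred[y == 0] == 1).mean()
    type2 = (pred[y == 1] == 0).mean()
    print(f"weight {w}: Type I {type1:.2f}, Type II {type2:.2f}")
```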