Vincent Yun Shen
Purdue University
Publication
Featured research published by Vincent Yun Shen.
IEEE Transactions on Software Engineering | 1985
Vincent Yun Shen; Tze-Jie Yu; Stephen M. Thebaut; Lorri R. Paulsen
A major portion of the effort expended in developing commercial software today is associated with program testing. Schedule and/or resource constraints frequently require that testing be conducted so as to uncover the greatest number of errors possible in the time allowed. In this paper we describe a study undertaken to assess the potential usefulness of various product- and process-related measures in identifying error-prone software. Our goal was to establish an empirical basis for the efficient utilization of limited testing resources using objective, measurable criteria. Through a detailed analysis of three software products and their error discovery histories, we have found simple metrics related to the amount of data and the structural complexity of programs to be of value for this purpose.
IEEE Transactions on Software Engineering | 1983
Vincent Yun Shen; Samuel D. Conte; Hubert E. Dunsmore
The theory of software science was developed by the late M. H. Halstead of Purdue University during the early 1970s. It was first presented in unified form in the monograph Elements of Software Science published by Elsevier North-Holland in 1977. Since it claimed to apply scientific methods to the very complex and important problem of software production, and since experimental evidence supplied by Halstead and others seemed to support the theory, it drew widespread attention from the computer science community.
IEEE Transactions on Software Engineering | 1988
Tze-Jie Yu; Vincent Yun Shen; Hubert E. Dunsmore
Results are presented of an analysis of several defect models using data collected from two large commercial projects. Traditional models typically use either program metrics (i.e., measurements from software products) or testing time or combinations of these as independent variables. The limitations of such models have been well documented. The models considered use the number of defects detected in the earlier phases of the development process as the independent variable. This number can be used to predict the number of defects to be detected later, even in modified software products. A strong correlation between the number of earlier defects and that of later ones was found. Using this relationship, a mathematical model was derived which may be used to estimate the number of defects remaining in software. This defect model may also be used to guide software developers in evaluating the effectiveness of the software development and testing processes.
Journal of Systems and Software | 1981
Scott N. Woodfield; Vincent Yun Shen; Hubert E. Dunsmore
As the cost of programming becomes a major component of the cost of computer systems, it becomes imperative that program development and maintenance be better managed. One measurement a manager could use is programming complexity. Such a measure can be very useful if the manager is confident that the higher the complexity measure is for a programming project, the more effort it takes to complete the project and perhaps to maintain it. Until recently most measures of complexity were based only on intuition and experience. In the past 3 years two objective metrics have been introduced, McCabe's cyclomatic number v(G) and Halstead's effort measure E. This paper reports an empirical study designed to compare these two metrics with a classic size measure, lines of code. A fourth metric based on a model of programming is introduced and shown to be better than the previously known metrics for some experimental data.
Advances in Computers | 1985
Samuel D. Conte; Hubert E. Dunsmore; Vincent Yun Shen
Publisher Summary: This chapter is concerned with effort and cost estimation models that are appropriate for software project development. Project development is meant to include life cycle phases from project design, through system integration, to testing and software delivery. It discusses and evaluates several models for software effort estimation, and the performance of these models is compared on sets of projects for which some information is available. The models are categorized into (1) historical-experiential models, (2) statistically-based models, (3) theoretically-based models, and (4) composite models. There is some hope that effort and cost models so restricted can be developed which are transportable from one organization to another. The chapter concludes that further experimentation, the gathering of more data, and the combining and enhancing of models will be necessary in order to allow computer scientists to explain and better control the software development process.
Software - Practice and Experience | 1982
Douglas E. Comer; Vincent Yun Shen
When a document is prepared using a computer system, it can be checked for spelling errors automatically and efficiently. This paper reviews and compares several methods for searching an English spelling dictionary. It also presents a new technique, hash‐bucket search, for searching a static table in general, and a dictionary in particular. Analysis shows that with only a small amount of space beyond that required to store the keys, the hash‐bucket search method has many advantages over existing methods. Experimental results with a sample dictionary using double hashing and the hash‐bucket techniques are presented.
Information Processing and Management | 1984
Stephen M. Thebaut; Vincent Yun Shen
Recent work conducted by members of the Purdue Software Metrics Research Group has focused on the complexity associated with coordinating the activities of persons involved in large-scale programming efforts. A resource model is presented which is designed to reflect the impact of this complexity on the economics of software development. The model is based on a formulation in which development effort is functionally related to measures of product size and manloading. The particular formulation used is meant to suggest a logical decomposition of development effort into components related to the independent programming activity of individuals and to the overhead associated with the required information flow within a programming team. The model is evaluated in light of acquired data reflecting a large number of commercially developed software products from two separate sources. Additional sources of data are actively being sought. Although strongly analytic in nature, the model's performance is, for the available data, at least as good in accounting for the observed variability in development effort as some highly publicized empirically based models of comparable complexity. It is argued, however, that the model's principal strength lies not in its data-fitting ability, but rather in its straightforward and intuitively appealing representation of relationships involving manpower, time, and effort.
Measurement and Modeling of Computer Systems | 1982
Samuel D. Conte; Vincent Yun Shen; K. Dickey
Halstead, in his theory of software science, proposed that in the Fortran language each occurrence of a GOTO i for a different label i be counted as a unique operator. Several writers have questioned the wisdom of this method of counting GOTOs. In this paper, we investigate the effect of counting GOTOs as several occurrences of a single unique operator on various software science metrics. Some 412 modules from the International Mathematical and Statistical Libraries (IMSL) are used as the database for this study.
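The difference between the two counting rules can be illustrated with a minimal sketch. It is not taken from the paper: the token lists and the helper names `halstead` and `merge_gotos` are hypothetical, and only the basic software science measures are computed (unique operators n1, unique operands n2, length N = N1 + N2, and volume V = N log2(n1 + n2)).

```python
import math


def halstead(operators, operands):
    """Compute basic software science measures from two token lists."""
    n1 = len(set(operators))  # unique operators
    n2 = len(set(operands))   # unique operands
    length = len(operators) + len(operands)  # N = N1 + N2
    volume = length * math.log2(n1 + n2)     # V = N * log2(n1 + n2)
    return {"n1": n1, "n2": n2, "N": length, "V": volume}


def merge_gotos(operators):
    """Alternative rule: every 'GOTO <label>' is the single operator 'GOTO'."""
    return ["GOTO" if op.startswith("GOTO ") else op for op in operators]


# Hypothetical token stream for a small Fortran routine (illustration only).
operators = ["=", "IF", "GOTO 10", "+", "GOTO 20", "=", "GOTO 10"]
operands = ["I", "0", "I", "N", "I", "1", "I"]

per_label = halstead(operators, operands)           # one operator per label
merged = halstead(merge_gotos(operators), operands)  # one GOTO operator

print(per_label)
print(merged)
```

Under these assumptions, merging the GOTOs lowers the unique-operator count n1 and therefore the vocabulary and volume, while the program length N is unchanged, since the total number of operator occurrences is the same under both rules.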
Journal of Systems and Software | 1988
Tze-Jie Yu; Brian A. Nejmeh; Hubert E. Dunsmore; Vincent Yun Shen
This paper presents the data and capabilities provided by the Software Metrics Data Collection (SMDC) system. SMDC is an APL-based system that runs on the UNIX™ 4.3BSD system at Purdue University. The data stored in SMDC were collected from hundreds of (1) software products developed in industrial environments and (2) experiments conducted at Purdue University. The largest software product in SMDC has more than 1,000,000 lines of code. SMDC also provides a number of statistical functions and plotting routines that can be used for detailed analysis of existing data. The data and tools in SMDC are available for use by non-Purdue researchers with some limitations.
ACM Sigsoft Software Engineering Notes | 1982
John W. Bailey; Kenneth J. Christensen; Helmut Krcmar; Jean-Louis Lassez; Vincent Yun Shen; Scott N. Woodfield
Various methods exist for estimating the complexity of single modules and the complexity of interacting modules. We propose to demonstrate how the degree of programmer inter-communication in a multi-programmer project also has a significant effect on productivity. Specifically, projects that use programming team structures requiring more inter-programmer communication will experience lower productivity than projects that use team structures that minimize the need for inter-programmer communication. The design proposed here involves the simultaneous re-implementation of an existing, well-documented design by teams of seven graduate-level students. It offers the advantage of alternate levels of work depending upon the amount of data collection and analysis to be underwritten. The results of even the least demanding study will be of significance to any medium- to large-scale development environment as a guide to selecting programming team structure. The more involved study will give more general results as to the effects of unconstrained communication paths in a team environment.