Research Method - A Study on a Dynamic Model of the Applicability of Techniques for Building Software Assessment Models (II)

3.1 Factor Framework for the Adaptability of Software Assessment Models

The main goal of this research is to provide an easily extensible factor framework for the adaptability of software assessment models and their associated techniques. Software researchers can use this framework to validate their models or techniques and to make explicit the pre-conditions for applying them. Project managers can use it to characterize their problems and then choose suitable models or techniques.

To increase the extensibility of the framework and the integrity among its factors, we adopt a rule-based framework in this study. Adopters can add new factors without substantially reforming or modifying the existing framework, and can formulate their problems with these factors and logical operators (e.g., “AND” and “OR”). The factors proposed in this study are described as follows:

• Regarding dependent variables
  - Scale type of the dependent variables: nominal, ordinal, interval, ratio, or absolute scales.
  - Amount (nominal and ordinal scales) or range (the remaining scales) of values in a dependent variable.
  - Vagueness level of the dependent variables.
  - Uncertainty level of the dependent variables.
  - Distribution type of values in the dependent variables: discrete uniform distribution, normal distribution, uniform distribution, etc.
• Regarding independent variables
  - Total number of independent variables in a model.
  - Distribution of different types of independent variables in a model:
    - Based on different scale types of independent variables in a model: nominal, ordinal, interval, ratio, and absolute scales.
    - Based on different types of value range of independent variables in a model.
    - Based on different vagueness levels of independent variables in a model.
    - Based on different uncertainty levels of independent variables in a model.
    - Based on different distribution types of independent variables in a model: discrete uniform distribution, normal distribution, uniform distribution, etc.
• Problems with no objective, a single objective, multiple non-conflicting objectives, or multiple conflicting objectives.
• Quality (the ratio of missing data or outliers) and amount of historical data.
• The levels of assessment accuracy and efficiency that are acceptable to an appraiser for a given model.
• Number of assessment layers required to reach a decision in an assessment model (such as one-layered comprehensive assessment and multi-layered iterative assessment).
• Others.
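The rule-based idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the framework's actual implementation: the factor names (`dv_scale`, `layers`, `objectives`), their values, and the example pre-condition are all hypothetical.

```python
# Hypothetical sketch: a problem is a set of factor values, and the
# pre-conditions of a technique are rules combined with AND / OR.

def factor(name, allowed):
    """Rule that holds when the problem's value for `name` is in `allowed`."""
    return lambda problem: problem.get(name) in allowed

def AND(*rules):
    """Rule that holds when every sub-rule holds."""
    return lambda problem: all(rule(problem) for rule in rules)

def OR(*rules):
    """Rule that holds when at least one sub-rule holds."""
    return lambda problem: any(rule(problem) for rule in rules)

# Made-up pre-condition for some technique (values are illustrative only).
technique_rule = AND(
    factor("dv_scale", {"nominal", "ordinal"}),
    OR(factor("layers", {"multi"}), factor("objectives", {"single"})),
)

problem = {
    "dv_scale": "nominal",
    "layers": "multi",
    "objectives": "multiple-conflicting",
}
applicable = technique_rule(problem)
```

In this representation, adding a new factor only means adding a new key to the problem description and referring to it in new rules; existing rules are left untouched, which is the kind of extensibility the framework aims for.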

3.2 Comparisons of Multi-Layered and One-Layered Assessment Models Based on Empirical Software Measurement Dataset

In this sub-section, we attempt to demonstrate the practicability of the framework by choosing one main factor for an experiment: the number of assessment layers required to reach a decision in an assessment model. We accordingly classify software assessment models into two types: one-layered and multi-layered.

Most existing software assessment models are one-layered: all measurement data must be available before an assessment result can be generated. A multi-layered software assessment model, in contrast, provides more than one assessment layer from which the appraiser can derive an assessment result. That is, such a model allows the appraiser to collect measurement data gradually while performing the assessment. Intuitively, the one-layered model requires more data-collection effort than the multi-layered model, since all measurement data must be collected. Conversely, the one-layered model uses more complete measurement data and is therefore more likely to produce better assessment accuracy than the multi-layered model.

However, assessment activities usually need to be performed repeatedly throughout the software development life cycle, and collecting software measurement data is generally strenuous and labor-intensive. It is therefore essential for software project managers to determine which assessment model suits their objectives, since the choice significantly affects both the assessment efficiency (the effort driven by the amount of measurement data to be collected) and the accuracy of the assessment result. This study thus aims to compare the assessment accuracy and efficiency of the multi-layered and one-layered models.

3.2.1 Research Procedure

The comparison of performance between one-layered and multi-layered assessment models comprises the following four steps.

Step 1: Define the two types of assessment models and select the criteria for the techniques adopted in this study.

1. Define one-layered and multi-layered assessment models

A one-layered assessment model has only one assessment layer, in which all measurement data must be available when the assessment is performed. In contrast, a multi-layered assessment model provides more than one assessment layer from which the appraiser can derive an assessment result. That is, such a model allows the appraiser to collect measurement data gradually while performing the assessment; once the expected assessment accuracy is achieved, the assessment is complete. Hence, the appraiser need not prepare all measurement data at the first layer of the model, but collects only the measurement data necessary at each particular assessment layer.

2. Decide the factors for selecting the modeling techniques

The following three factors are used to choose the modeling techniques for these two assessment models.

(1) The selected techniques should be widely used in software-related classification studies.

(2) The selected techniques should automatically select the appropriate independent variables in the model, so that the efficiency of the assessment models can be objectively compared.

(3) The selected techniques can be found in commercial-off-the-shelf (COTS) software packages, so that researchers and practitioners can easily use these packages for building the models.

Step 2: Determine the assessment accuracy and efficiency indicators for comparing the performance of the two software assessment models, as described in Section 3.2.3.

Step 3: Preprocess the empirical software measurement data. Three-fold cross-validation, a widely adopted method in the literature [42], was used to establish and validate the models. Project data are randomly arranged into three groups of equal size via stratified sampling, so that the class proportions in the training and test data better match the distribution of the raw data, giving more stable and realistic accuracy measures. Two groups are used as the training dataset and the remaining group is treated as the test dataset. That is, two-thirds of the project data are used to build the model and one-third is used to validate the performance of the established model. This procedure is repeated three times with different combinations, and the accuracy results on the three test datasets are then aggregated as the test result.
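The stratified three-fold procedure in Step 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the tooling used in the study: the function names are hypothetical, the round-robin dealing is one simple way to stratify, and the 33/82 class split is an assumption chosen only because it is consistent with the 28.7% proportion reported for the 115-project dataset.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=3, seed=0):
    """Return k lists of indices whose class proportions follow the raw data."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold receives its share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds

def cross_validation_splits(labels, k=3, seed=0):
    """Yield (train_indices, test_indices); each fold is the test set once."""
    folds = stratified_folds(labels, k, seed)
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]

# Toy labels: 1 = cost risk-prone, 2 = non-risk-prone (assumed 33 vs 82 split).
labels = [1] * 33 + [2] * 82
splits = list(cross_validation_splits(labels))
```

Each of the three iterations trains on roughly two-thirds of the projects and tests on the remaining third, and every project appears in exactly one test fold, so the per-fold test results can be aggregated over the whole dataset.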

Step 4: Establish the two types of software assessment models and compare their performance.

The training datasets prepared in Step 3 are used to establish the two types of software assessment models with the techniques identified in Step 1. The performance of these models is then computed at the test stage. Finally, the experimental results are presented, and discussions and conclusions are drawn accordingly.

3.2.2 The Selected Techniques

According to the factors identified in Step 1 of Section 3.2.1, four classification modeling techniques were selected for establishing the two types of software assessment models: artificial neural network (ANN), discriminant analysis (DA), decision tree (DT), and logistic regression (LR). The taxonomy and brief descriptions of these four techniques are given as follows.

3.2.2.1 Classification Techniques for Building the One-layered Assessment Model

• Artificial Neural Network (ANN)

An ANN-based software assessment model is represented as a set of interconnected input/output neurons, where each connection has an associated weight, so as to simulate biological learning behavior in an assessment. Among the various types of ANN models, the best known in previous software-related classification studies is the feed-forward back-propagation ANN (FF-BP ANN). At the training stage, the weights are computed with the feed-forward back-propagation algorithm, which is a form of gradient descent. In this study, Saha’s FF-BP ANN classifier, written in Excel VBA, was used to establish the ANN-based one-layered software assessment model. FF-BP ANN is described in more detail in [43].

• Discriminant Analysis (DA)

DA is also one of the widely used methods in previous software-related classification studies. Given a set of independent variables, a DA-based software assessment model attempts to find a suitable linear combination of those variables. The model builds its discriminant functions by using the ordinary least-squares and stepwise methods, as well as the linear regression method. In this study, the DA module in the SPSS package was used to build the DA-based one-layered software assessment model with the smallest F ratio stepwise method. DA is described in more detail in [44].

• Logistic Regression (LR)

LR is a generalization of the linear regression method to a discrete outcome. Because the dependent variable is discrete, an LR-based software assessment model cannot be established directly with the conventional linear regression approach. Hence, instead of predicting directly whether the outcome of the dependent variable will occur, the model is built to predict the logarithm of the odds of the outcome occurring, based on maximum likelihood estimation and a variable selection method. The LR software assessment model is also popular because it allows researchers to overcome several restrictive assumptions of the ordinary least squares (OLS) regression and discriminant analysis methods. In this study, the multinomial LR (MNLR) in the SPSS package was adopted to establish the LR-based one-layered software assessment model with the forward entry stepwise method. MNLR is described in more detail in [44].
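For a two-class dependent variable, the log-odds form described above is conventionally written as follows. The symbols are introduced here only for illustration and do not come from the original text: $p$ is the probability that the outcome occurs, $x_1, \dots, x_k$ are the independent variables retained by the selection method, and $\beta_0, \dots, \beta_k$ are the coefficients estimated by maximum likelihood.

```latex
\ln\!\left(\frac{p}{1-p}\right) = \beta_0 + \sum_{i=1}^{k} \beta_i x_i
\qquad\Longleftrightarrow\qquad
p = \frac{1}{1 + \exp\!\left(-\beta_0 - \sum_{i=1}^{k} \beta_i x_i\right)}
```

The left-hand form shows why the model is linear in the log-odds even though the outcome itself is discrete; the right-hand form recovers the predicted probability, which is then compared with a cut-off to assign a class.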

3.2.2.2 Classification Techniques for Building the Multi-layered Assessment Model

• Decision Tree (DT)

Based on historical data, a DT-based software assessment model is represented as a tree graph or as multi-layered rules for classifying new, unseen cases. It provides various assessment paths or layers, so that assessors can treat each new case individually. Many decision tree algorithms have been proposed in the literature and widely used in software-related classification studies. Some examples are C4.5 [21], ID3 [18, 19], SPRINT-SLIQ [22], and TREEDISC [20].

In this study, the 8th release of Quinlan’s C4.5 algorithm (C4.5-R8) [45] was used to establish the DT-based multi-layered software assessment model, as it is well known in classification studies and can easily handle different splitting problems for ordinal independent variables. C4.5-R8 is a descendant of the induction programs ID3 (Quinlan, 1979) and C4 (Quinlan, 1987) [46]. C4.5 is described in more detail in [46].
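To make the link between a decision tree and multi-layered assessment concrete, the sketch below walks a small tree and asks for a measurement only when a node actually needs it. It is an illustration under stated assumptions, not output of C4.5-R8: the tree, its thresholds, the variable names (`R10`, `R18`, `R5`), and the `assess` function are all hypothetical.

```python
# Hypothetical tree: internal nodes test one variable against a threshold;
# leaves carry the class (1 = cost risk-prone, 2 = non-risk-prone).
TREE = {
    "var": "R10", "threshold": 12,
    "low": {"label": 2},
    "high": {
        "var": "R18", "threshold": 9,
        "low": {"label": 2},
        "high": {"label": 1},
    },
}

def assess(tree, collect):
    """Walk the tree, calling collect(var) only for variables actually needed."""
    used = []
    node = tree
    while "label" not in node:
        value = collect(node["var"])  # measurement gathered at this layer
        used.append(node["var"])
        node = node["high"] if value > node["threshold"] else node["low"]
    return node["label"], used

# Measurements for one project; R18 and R5 are never requested on this path.
measurements = {"R10": 6, "R18": 20, "R5": 4}
label, used = assess(TREE, measurements.__getitem__)
```

Here the assessment finishes at the first layer using a single variable, whereas a one-layered model would have required every variable up front.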

3.2.3 Model Performance Indicators

As identified in Step 2 of Section 3.2.1, two indicators are used to compare the performance of the one-layered and multi-layered software assessment models: assessment accuracy and assessment efficiency. They are described in detail as follows.

3.2.3.1 Assessment Accuracy

The overall and partial misclassification rates have been widely used as indicators in previous software-related classification studies [23, 40], and are therefore adopted to evaluate and compare the assessment accuracy of the two software assessment models in this study. The lower the misclassification rate, the better the accuracy of the assessment model. In general, software experts find it hard to determine the misclassification costs (also called the weights) of all classes for integrating the partial misclassification rates into an overall misclassification rate, especially when there are more than two classes. Therefore, the equally-weighted overall misclassification rate (EW-OMR), shown in Eq. (1), is used as the measure for selecting the best training model.

The other assessment accuracy measure adopted in this study is the partial misclassification rate, which for a two-class dependent variable is divided into the Type I and Type II misclassification rates (Type I MR and Type II MR). The formulas are defined as Eqs. (2) and (3).
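The accuracy measures can be illustrated with a short sketch. It rests on two assumptions that should be checked against Eqs. (1) to (3): that a partial misclassification rate is the share of a class's cases assigned to the wrong class, and that "equally weighted" means the unweighted mean of those per-class rates. The sketch does not assume which class corresponds to Type I and which to Type II; the function name and the toy data are hypothetical.

```python
def misclassification_rates(actual, predicted, classes=(1, 2)):
    """Per-class misclassification rates and their equally weighted mean."""
    rates = {}
    for c in classes:
        # Predictions made for the cases whose true class is c.
        members = [p for a, p in zip(actual, predicted) if a == c]
        wrong = sum(1 for p in members if p != c)
        rates[c] = wrong / len(members)
    # Assumed reading of EW-OMR: every class counts the same, whatever its size.
    ew_omr = sum(rates.values()) / len(rates)
    return rates, ew_omr

# Toy data: 4 cases of class 1 and 10 cases of class 2 (made-up values).
actual    = [1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
predicted = [1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1]
rates, ew_omr = misclassification_rates(actual, predicted)
```

With unbalanced classes, this equally weighted mean differs from the plain proportion of misclassified cases, because errors in the smaller class carry as much weight as errors in the larger one.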

3.2.3.2 Assessment Efficiency

Three measures are used in this study to compare the assessment efficiency of the two software assessment models: “the total number of uniquely used variables in an assessment model” (TNoUVarsM), “the average number of used variables within all assessment paths” (ANoUVarsAP), and “the number of uniquely used variables in the longest assessment path” (NoUVarsLAP). TNoUVarsM measures the variable utility of an assessment model, while ANoUVarsAP and NoUVarsLAP describe the average and worst-case assessment efficiency of the model, respectively. Note that an assessment path represents one complete assessment, and the longest assessment path is the one that requires the maximal number of uniquely used variables in a model to finish a complete assessment. The formula of ANoUVarsAP is defined as Eq. (4).
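The three efficiency measures can be computed from a list of assessment paths, as in the sketch below. It assumes that ANoUVarsAP is the unweighted mean, over all paths, of the number of unique variables on each path; Eq. (4) may weight paths differently. The function name, the path representation, and the variable names are hypothetical.

```python
def efficiency_measures(paths):
    """paths: one list of variable names per assessment path.

    Returns (TNoUVarsM, ANoUVarsAP, NoUVarsLAP).
    """
    unique_per_path = [set(path) for path in paths]
    tno_uvars_m = len(set().union(*unique_per_path))   # unique variables in the model
    ano_uvars_ap = sum(len(s) for s in unique_per_path) / len(unique_per_path)
    no_uvars_lap = max(len(s) for s in unique_per_path)  # longest (worst-case) path
    return tno_uvars_m, ano_uvars_ap, no_uvars_lap

# A multi-layered model with four root-to-leaf paths (made-up example).
multi_layered = [
    ["R10"],
    ["R10", "R18"],
    ["R10", "R18", "R5"],
    ["R10", "R18", "R5"],
]
# A one-layered model has a single path that needs every variable it uses.
one_layered = [["R10", "R18", "R5", "R21"]]

multi_result = efficiency_measures(multi_layered)
one_result = efficiency_measures(one_layered)
```

For a one-layered model the three measures coincide, since its only assessment path uses every variable in the model; for a multi-layered model the average can be well below the total.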

3.2.4 An Empirical Software Measurement Dataset

An empirical software measurement dataset comprising 115 historical software projects [47] was adopted in this study. The profiles of this dataset are summarized in Table 1. The dependent variable is “whether the project is cost risk-prone or non-risk-prone”, where cost risk concerns whether the final product of a software project can be delivered within budget. The proportion of cost risk-prone projects is 28.7%. The cost risk-prone project class is coded as 1, and the non-risk-prone project class is coded as 2.

Table 1: Profiles of the collected 115 software projects

Project profiles              Mean    Std. Dev.  Min.  Max.
Project duration (months)     12      10.84      1     65
Delay time (percentage)       25.42   63.10      0     73.85
Cost overlay (percentage)     12.29   2.44       3     20
Staff turnover (percentage)   12.14   17.80      0     100

The independent variables in this study include 27 risk factors, which are classified into six dimensions. Each of them presents an exposure to an individual risk factor, which is defined as its impact on the project cost multiplied by its occurrence frequency. The measurement scale for the impact of an individual risk factor on the project cost has five levels - minimal or no impact (value=1), <5% (value=2), 5-7% (value=3), 7-10% (value=4) and >10% (value=5); while the measurement scale for the occurrence frequency of an individual risk factor also has five levels - remote (value=1), unlikely (value=2), likely (value=3), highly likely (value=4) and near certainty (value=5). The profiles of the 27 independent variables are shown in Table 2.
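The exposure calculation described above amounts to multiplying two five-level ratings. The sketch below is illustrative only; the function and dictionary names are hypothetical, while the level labels follow the scales given in the text.

```python
# Level labels taken from the measurement scales described in the text.
IMPACT_LEVELS = {
    1: "minimal or no impact", 2: "<5%", 3: "5-7%", 4: "7-10%", 5: ">10%",
}
FREQUENCY_LEVELS = {
    1: "remote", 2: "unlikely", 3: "likely", 4: "highly likely", 5: "near certainty",
}

def risk_exposure(impact, frequency):
    """Exposure of one risk factor = impact level x occurrence-frequency level."""
    if impact not in IMPACT_LEVELS or frequency not in FREQUENCY_LEVELS:
        raise ValueError("impact and frequency must each be an integer from 1 to 5")
    return impact * frequency

# A factor with a 7-10% cost impact (4) that is likely to occur (3).
exposure = risk_exposure(4, 3)
```

Each exposure therefore lies between 1 (1 x 1) and 25 (5 x 5).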

Table 2: Profiles of the 27 risk factors within the collected 115 software projects

Risk factors grouped by six dimensions                                            Mean    Std. Dev.  Min.  Max.

Organizational environment risks
 1. Change in organizational management during the project                        6.16    5.62       1     25
 2. Corporate politics with negative effect on project                            6.97    6.42       1     25
 3. Unstable organizational environment                                           5.25    5.62       1     25
 4. Organization undergoing restructuring during the project                      4.75    5.29       1     25

User risks
 5. Users resistant to change                                                     6.21    4.41       1     20
 6. Conflict between users                                                        6.34    5.11       1     20
 7. Users with negative attitudes toward the project                              5.32    5.57       1     25
 8. Users not committed to the project                                            6.03    5.04       1     25
 9. Lack of cooperation from users                                                5.83    5.03       1     25

Requirements risks
10. Continually changing system requirements                                      11.27   7.07       1     25
11. System requirements not adequately identified                                 8.89    6.68       1     25
12. Unclear system requirements                                                   8.68    7.06       1     25
13. Incorrect system requirements                                                 7.57    7.07       1     25

Project complexity risks
14. Project involved the use of new technology                                    7.44    5.84       1     25
15. High level of technical complexity                                            6.82    5.57       1     25
16. Immature technology                                                           5.33    5.53       1     25
17. Project involves use of technology that has not been used in prior projects   7.18    6.47       1     25

Planning & control risks
18. Lack of an effective project management methodology                           8.23    7.01       1     25
19. Project progress not monitored closely enough                                 7.20    6.69       1     25
20. Inadequate estimation of required resources                                   7.60    6.30       1     25
21. Poor project planning                                                         7.23    5.86       1     25
22. Project milestones not clearly defined                                        5.60    5.36       1     25
23. Inexperienced project manager                                                 6.25    6.17       1     25
24. Ineffective communication                                                     7.13    6.73       1     25

Team risks
25. Inadequately trained development team members                                 6.63    5.44       1     25
26. Inexperienced team members                                                    6.48    4.77       1     20
27. Team members lack specialized skills required by the project                  6.14    5.34       1     25
