Slide 6.1
TESTING
Slide 6.2
Overview
Quality issues
Non-execution-based testing
Execution-based testing
What should be tested?
Testing versus correctness proofs
Who should perform execution-based testing?
When testing stops
Slide 6.3
Testing
There are two basic types of testing
– Execution-based testing
– Non-execution-based testing
Slide 6.4
Testing (contd)
“V & V”
– Verification
» Determine if the workflow was completed correctly
– Validation
» Determine if the product as a whole satisfies its requirements
Slide 6.5
Testing (contd)
Warning
– The term “verify” is also used for all non-execution- based testing
Slide 6.6
6.1 Software Quality
Not “excellence”
The extent to which software satisfies its specifications
Every software professional is responsible for ensuring that his or her work is correct
– Quality must be built in from the beginning
Slide 6.7
6.1.1 Software Quality Assurance
The members of the SQA group must ensure that the developers are doing high-quality work
– At the end of each workflow – When the product is complete
In addition, quality assurance must be applied to
– The process itself
» Example: Standards
Slide 6.8
6.1.2 Managerial Independence
There must be managerial independence between
– The development group – The SQA group
Neither group should have power over the other
Slide 6.9
Managerial Independence (contd)
More senior management must decide whether to
– Deliver the product on time but with faults, or – Test further and deliver the product late
The decision must take into account the interests of the client and the development organization
Slide 6.10
6.2 Non-Execution-Based Testing
Underlying principles
– We should not review our own work – Group synergy
Slide 6.11
6.2.1 Walkthroughs
A walkthrough team consists of from four to six members
It includes representatives of
– The team responsible for the current workflow – The team responsible for the next workflow – The SQA group
The walkthrough is preceded by preparation
– Lists of items
» Items not understood
» Items that appear to be incorrect
Slide 6.12
6.2.2 Managing Walkthroughs
The walkthrough team is chaired by the SQA representative
In a walkthrough we detect faults, not correct them
– A correction produced by a committee is likely to be of low quality
– The cost of a committee correction is too high – Not all items flagged are actually incorrect
– A walkthrough should not last longer than 2 hours – There is no time to correct faults as well
Slide 6.13
Managing Walkthroughs (contd)
A walkthrough must be document-driven, rather than participant-driven
Verbalization leads to fault finding
A walkthrough should never be used for performance appraisal
Slide 6.14
6.2.3 Inspections
An inspection has five formal steps
– Overview
– Preparation, aided by statistics of fault types – Inspection
– Rework – Follow-up
Slide 6.15
Inspections (contd)
An inspection team has four members
– Moderator
– A member of the team performing the current workflow – A member of the team performing the next workflow – A member of the SQA group
Special roles are played by the
– Moderator – Reader – Recorder
Slide 6.16
Fault Statistics
Faults are recorded by severity
– Example:
» Major or minor
Faults are recorded by fault type
– Examples of design faults:
» Not all specification items have been addressed
» Actual and formal arguments do not correspond
Slide 6.17
Fault Statistics (contd)
For a given workflow, we compare current fault rates with those of previous products
We take action if there are a disproportionate number of faults in an artifact
– Redesigning from scratch is a good alternative
We carry forward fault statistics to the next workflow
– We may not detect all faults of a particular type in the current inspection
Slide 6.18
Statistics on Inspections
IBM inspections showed up
– 82% of all detected faults (1976) – 70% of all detected faults (1978) – 93% of all detected faults (1986)
Switching system
– 90% decrease in the cost of detecting faults (1986)
JPL
– Four major faults, 14 minor faults per 2 hours (1990) – Savings of $25,000 per inspection
– The number of faults decreased exponentially by phase (1992)
Slide 6.19
Statistics on Inspections (contd)
Warning
Fault statistics should never be used for performance appraisal
– “Killing the goose that lays the golden eggs”
Slide 6.20
6.2.4 Comparison of Inspections and Walkthroughs
Inspection
– Two-step, informal process
» Preparation
» Analysis
Walkthrough
– Five-step, formal process
» Overview
» Preparation
» Inspection
» Rework
» Follow-up
Slide 6.21
6.2.5 Strengths and Weaknesses of Reviews
Reviews can be effective
– Faults are detected early in the process
Reviews are less effective if the process is inadequate
– Large-scale software should consist of smaller, largely independent pieces
– The documentation of the previous workflows has to be complete and available online
Slide 6.22
6.2.6 Metrics for Inspections
Inspection rate (e.g., design pages inspected per hour)
Fault density (e.g., faults per KLOC inspected)
Fault detection rate (e.g., faults detected per hour)
Fault detection efficiency (e.g., number of major, minor faults detected per hour)
Slide 6.23
Metrics for Inspections (contd)
Does a 50% increase in the fault detection rate mean that
– Quality has decreased? Or
– The inspection process is more efficient?
Slide 6.24
6.3 Execution-Based Testing
Organizations spend up to 50% of their software budget on testing
– But delivered software is frequently unreliable
Dijkstra (1972)
– “Program testing can be a very effective way to show the presence of bugs, but it is hopelessly inadequate for showing their absence”
Slide 6.25
6.4 What Should Be Tested?
Definition of execution-based testing
– “The process of inferring certain behavioral properties of the product based, in part, on the results of executing the product in a known environment with selected
inputs”
This definition has troubling implications
Slide 6.26
6.4 What Should Be Tested? (contd)
“Inference”
– We have a fault report, the source code, and — often — nothing else
“Known environment”
– We never can really know our environment
“Selected inputs”
– Sometimes we cannot provide the inputs we want – Simulation is needed
Slide 6.27
6.4 What Should Be Tested? (contd)
We need to test correctness (of course), and also
– Utility
– Reliability
– Robustness, and – Performance
Slide 6.28
6.4.1 Utility
The extent to which the product meets the user’s needs
– Examples:
» Ease of use
» Useful functions
» Cost effectiveness
Slide 6.29
6.4.2 Reliability
A measure of the frequency and criticality of failure
– Mean time between failures – Mean time to repair
– Time (and cost) to repair the results of a failure
Slide 6.30
6.4.3 Robustness
A function of
– The range of operating conditions
– The possibility of unacceptable results with valid input – The effect of invalid input
Slide 6.31
6.4.4 Performance
The extent to which space and time constraints are met
Real-time software is characterized by hard real- time constraints
If data are lost because the system is too slow
– There is no way to recover those data
Slide 6.32
6.4.5 Correctness
A product is correct if it satisfies its specifications
Slide 6.33
Correctness of specifications
Incorrect specification for a sort:
Function trickSort which satisfies this specification:
Figure 6.1
Figure 6.2
Slide 6.34
Correctness of specifications (contd)
Incorrect specification for a sort:
Corrected specification for the sort:
Figure 6.1 (again)
Figure 6.3
Slide 6.35
Correctness (contd)
Technically, correctness is
Not necessary
– Example: C++ compiler
Not sufficient
– Example: trickSort
Slide 6.36
6.5 Testing versus Correctness Proofs
A correctness proof is an alternative to execution- based testing
Slide 6.37
6.5.1 Example of a Correctness Proof
The code segment to be proven correct
Figure 6.4
Slide 6.38
Example of a Correctness Proof (contd)
A flowchart
equivalent of the code segment
Figure 6.5
Slide 6.39
Example of a Correctness Proof (contd)
Add
– Input specification – Output specification – Loop invariant
– Assertions
(See next slide)
Slide 6.40
Figure 6.6
Example of a Correctness Proof (contd)
Slide 6.41
Example of a Correctness Proof (contd)
An informal proof (using induction) appears in Section 6.5.1
Slide 6.42
6.5.2 Correctness Proof Mini Case Study
Dijkstra (1972):
– “The programmer should let the program proof and program grow hand in hand”
“Naur text-processing problem” (1969)
Slide 6.43
Naur Text-Processing Problem
Given a text consisting of words separated by a
blank or by newline characters, convert it to line-by- line form in accordance with the following rules:
Line breaks must be made only where the given text contains a blank or newline
Each line is filled as far as possible, as long as
No line will contain more than maxpos characters
Slide 6.44
Episode 1
Naur constructed a 25-line procedure
He informally proved its correctness
Slide 6.45
Episode 2
1970 — Reviewer in Computing Reviews
– The first word of the first line is preceded by a blank
unless the first word is exactly maxpos characters long
Slide 6.46
Episode 3
1971 — London finds 3 more faults
Including:
– The procedure does not terminate unless a word longer than maxpos characters is encountered
Slide 6.47
Episode 4
1975 — Goodenough and Gerhart find three further faults
Including:
– The last word will not be output unless it is followed by a
blank or newline
Slide 6.48
Correctness Proof Mini Case Study (contd)
Lesson:
Even if a product has been proven correct, it must still be tested
Slide 6.49
6.5.3 Correctness Proofs and Software Engineering
Three myths of correctness proving (see over)
Slide 6.50
Three Myths of Correctness Proving
Software engineers do not have enough mathematics for proofs
– Most computer science majors either know or can learn the mathematics needed for proofs
Proving is too expensive to be practical
– Economic viability is determined from cost–benefit analysis
Proving is too hard
– Many nontrivial products have been successfully proven – Tools like theorem provers can assist us
Slide 6.51
Difficulties with Correctness Proving
Can we trust a theorem prover ?
Figure 6.7
Slide 6.52
Difficulties with Correctness Proving (contd)
How do we find input–output specifications, loop invariants?
What if the specifications are wrong?
We can never be sure that specifications or a verification system are correct
Slide 6.53
Correctness Proofs and Software Engineering (contd)
Correctness proofs are a vital software engineering tool, where appropriate:
– When human lives are at stake
– When indicated by cost–benefit analysis – When the risk of not proving is too great
Also, informal proofs can improve software quality
– Use the assert statement
Model checking is a new technology that may eventually take the place of correctness proving (Section 18.11)
Slide 6.54
6.6 Who Should Perform Execution-Based Testing?
Programming is constructive
Testing is destructive
– A successful test finds a fault
So, programmers should not test their own code artifacts
Slide 6.55
Who Should Perform Execution-Based Testing? (contd)
Solution:
– The programmer does informal testing
– The SQA group then does systematic testing – The programmer debugs the module
All test cases must be
– Planned beforehand, including the expected output, and – Retained afterwards
Slide 6.56
6.7 When Testing Stops
Only when the product has been irrevocably discarded