Pairwise and combinatorial testing can dramatically improve the efficiency and effectiveness of both test design (identifying and documenting what to test) and test execution (running the test cases).
This presentation by Justin Hunter, the founder of Hexawise, to members of TISQA explains how these methods work, highlights empirical evidence that they more than doubled the number of defects found per tester hour in ten separate projects, and presents a case study of a recent user of the Hexawise test design tool.
2. Intent of today
"This" (sharing effective methods):
William G. Hunter (my dad, a professor of applied statistics): travelled to Nigeria for a year with our family to teach / share his expertise of Design of Experiments with chemical engineering students there, because he thought it would help them do their jobs more efficiently and effectively.
Not "this" (convincing you to buy my tool and give me money):
Larry Ellison (Oracle CEO): prioritizes making boatloads of money through software sales (... "not that there's anything wrong with that").
3. Topics
I. Welcome
II. Introduction / Executive Summary
- What challenges does combination testing solve?
- How does combination testing work?
- Where can it be used and what benefits does it deliver?
- What are critical success factors and general lessons learned from 2 dozen projects?
- Why isn't this method of test design better known and more widespread?
III. How the Proof of Concept Pilots Were Conducted
- Structure of Pilots
- "2 testers in a room w/ Hexawise vs. 4 testers in a room w/out Hexawise"
- Three hypotheses that were tested in each case study:
- No. 1 - Test design speed: savings of at least 30%
- No. 2 - Test execution speed: savings of at least 25%
- No. 3 - Cost of defect resolution: savings of at least 20%
IV. Blue Cross Blue Shield North Carolina Case Study -
- What kind of testing projects were chosen? Why?
- Findings for each of the three hypotheses - Were targeted savings achieved?
- Ease of use / change management - How easy was it to learn? ...to implement?
- Lessons learned - What went well? ... not so well? ... any surprises?
- Turning the “speed v. quality” dial - How will efficiencies created be used?
- Implications - Will you use Hexawise again? Why? Why not? Where?
4. Hexawise is a new test design tool. It is available
through a SaaS (software as a service) model.
Sign up for a free trial.
http://hexawise.com/users/new
(Only 4 required questions)
5. What vs. How
Treat these two topics separately for this presentation:
What Should be Tested?
• H/W Configurations
• S/W Configurations
• User Types
• Business Rules
• Sequences of Pathways
• Data
• Features
• Products Selected
• Primary Actions
• What Level of Detail
... or anything else that seems relevant
How Should it be Tested?
• What combinations should be tested together?
• Test scripts in what order?
6. What vs. How - Implications
What Should be Tested? (H/W configurations, S/W configurations, user types, business rules, sequences of pathways, data, features, products selected, primary actions, what level of detail ... or anything else that seems relevant)
No change here: this should continue to be SME-driven.
How Should it be Tested? (What combinations should be tested together? Test scripts in what order?)
We have a better way: use Hexawise to determine what variables should be tested when (and in what combinations). Our scientifically-proven optimization methods create better tests, faster.
7. Benefits Summary
Challenges Addressed:
• Too long to identify and document test cases (manual process)
• Inefficient test cases (redundant testing and coverage gaps)
• High defect resolution costs (defects found late)
Order of Magnitude Benefits Delivered:
Assuming 100 testers @ $100 / hour, with 1,500 hours / year per tester spent on test design and test execution:
~ $5 million / year in benefits
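The arithmetic behind that figure can be checked directly. In the sketch below, the 30% savings rate is an assumption (roughly the deck's minimum test-design saving), not a number stated on the slide:

```python
# Back-of-the-envelope check of the slide's "~$5 million / year" figure,
# using its stated assumptions: 100 testers, $100/hour, 1,500 hours/year
# per tester on test design + execution.
testers = 100
rate = 100            # $ / hour
hours = 1_500         # hours per tester per year
savings_rate = 0.30   # assumed overall efficiency gain (not from the slide)

annual_spend = testers * rate * hours          # $15,000,000
annual_benefit = annual_spend * savings_rate   # $4,500,000, i.e. ~$5M
print(annual_spend, annual_benefit)
```

A 30% gain on a $15M annual testing spend lands in the "~$5 million / year" range the slide claims.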
8. Hexawise solves a constant problem: "There are way too
many options to test everything. What should we test?"
[Slide graphic: three worked examples of combinatorial explosion. Multiplying the option counts of each parameter (6 browser choices x 4 options x 3 options x 2 options, and so on) yields 884,736 possible tests for one feature set and 13,824 for another. This single web page could be tested with 72,477,573,120 possible tests.]
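The totals on the slide are just products of per-parameter option counts. The exact list of counts is garbled in this transcript, so the counts below are illustrative, chosen only to reproduce the slide's 884,736 figure:

```python
import math

# Illustrative per-parameter option counts (not the slide's actual list):
# one 6-way choice, two 3-way, six 4-way, and two 2-way parameters.
option_counts = [6, 3, 3, 4, 4, 4, 4, 4, 4, 2, 2]

# Exhaustive testing would require the product of all option counts.
total_tests = math.prod(option_counts)
print(total_tests)  # 884736
```

Adding even one more 4-way parameter quadruples the total, which is why exhaustive testing stops being realistic so quickly.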
10. Next, users create tests that will cover interactions of
every valid pair of values in as few tests as possible.
(1) Browser = “Opera” tested with (2) View = “Satellite?” Covered.
(1) Mode of Transport = “Walk” tested with (2) Show Photos = “Y”? Covered.
(1) Avoid Toll Roads = “Y” tested with (2) Show Traffic = “Y (Live)” ? Covered.
(1) Browser = IE6 tested with (2) Distance in = KM and (3) Zoom in = “Y” ?
That is a 3-way interaction. It might not be covered in these 35 tests. See next page.
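The deck does not spell out Hexawise's algorithm. As an illustration of the pairwise idea itself, here is a minimal greedy 2-way generator; the parameter names and values in the demo are invented for the example:

```python
from itertools import combinations, product

def generate_pairwise(params):
    """Greedy 2-way (pairwise) test generator: returns a list of
    {parameter: value} tests covering every valid pair of values.
    params: dict mapping parameter name -> list of values."""
    names = sorted(params)

    def key(p, v, q, w):
        # Canonical, order-independent id for a pair of (param, value) picks.
        return tuple(sorted([(p, v), (q, w)]))

    uncovered = {key(a, va, b, vb)
                 for a, b in combinations(names, 2)
                 for va, vb in product(params[a], params[b])}
    tests = []
    while uncovered:
        # Seed each test with a still-uncovered pair so progress is guaranteed.
        (p1, v1), (p2, v2) = next(iter(uncovered))
        test = {p1: v1, p2: v2}
        for name in names:
            if name in test:
                continue
            # Pick the value that covers the most not-yet-covered pairs.
            test[name] = max(params[name],
                             key=lambda v: sum(key(name, v, o, ov) in uncovered
                                               for o, ov in test.items()))
        for a, b in combinations(names, 2):
            uncovered.discard(key(a, test[a], b, test[b]))
        tests.append(test)
    return tests

# 4 x 3 x 2 = 24 exhaustive tests, but pairwise needs far fewer
# (the largest pair block, browser x view, alone requires 12).
demo = generate_pairwise({"browser": ["IE6", "Firefox", "Opera", "Safari"],
                          "view": ["Map", "Satellite", "Hybrid"],
                          "units": ["Miles", "KM"]})
print(len(demo))
```

A production tool does much better than this greedy sketch on large models, but the principle is the same: every valid value pair appears in at least one test, in a small fraction of the full Cartesian product.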
11. It also creates more thorough tests for all combinations
involving 3 values, as below, or 4, 5 or even 6 values.
(1) Browser = IE6 tested with (2) Distance in = KM and (3) Zoom in = “Y” ? Covered.
Any 3 valid values you can imagine? Yes, at least 1 of the 184 tests will cover all 3 together.
If even higher quality is desired, all possible 4, 5, or 6-way interactions could be tested for.
12. It also creates mixed-strength solutions, resulting in
relatively more tests focused on selected priority areas.
13. One of the advantages of this approach is that it creates
coverage data...
Every test plan has a finite number of valid combinations of parameter values (involving, in
this case, 2 parameter values). The chart below shows, at each point in the test plan, what
percentage of the total possible number of relevant combinations have been covered.
In this set of test cases, as in most, there is a significant decreasing marginal return.
[Chart: "% Coverage by Number of Tests" - coverage of 2-way combinations rises steeply over the first few tests, then flattens as it approaches 100%.]
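The percentage plotted in this kind of chart is easy to compute for any ordered list of tests. A self-contained sketch (parameter names and values are invented for the example):

```python
from itertools import combinations, product

def pair_coverage_curve(params, tests):
    """Cumulative % of all valid 2-way combinations covered after each test."""
    names = sorted(params)
    all_pairs = {(a, va, b, vb)
                 for a, b in combinations(names, 2)
                 for va, vb in product(params[a], params[b])}
    seen, curve = set(), []
    for t in tests:
        for a, b in combinations(names, 2):
            seen.add((a, t[a], b, t[b]))
        curve.append(round(100 * len(seen) / len(all_pairs), 1))
    return curve

# Running the full Cartesian product shows the diminishing returns:
# early tests each cover many new pairs, later ones almost none.
params = {"browser": ["IE6", "Firefox"], "view": ["Map", "Satellite"],
          "units": ["Miles", "KM"]}
tests = [dict(zip(sorted(params), combo))
         for combo in product(*(params[k] for k in sorted(params)))]
print(pair_coverage_curve(params, tests))
```

Plotting that list against test number reproduces the shape of the slide's chart: steep at first, then flat.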
14. ... which is useful in determining “How much testing is
enough?” / “When should we stop testing?”
If you found three defects in this test plan’s
first 50 tests, you would find approximately
one more defect in the next 200 tests.
[Chart: projected defect discovery over tests 0 to 250.]
15. Standard Benefits
• Test Design Time: faster by at least 30%
• Test Execution Time: faster by at least 25%
• Bug Fixing Costs: lower by at least 20%
16. Benefits Data
These benefits numbers are backed by empirical data.
Small proof-of-concept pilots can be done within 2 weeks.
Source: "Combinatorial Testing," IEEE Computer, August 2009. Dr. Rick Kuhn, Dr. Raghu Kacker, Dr. Jeff Lei, and Justin Hunter.
17. Benefits Data
~85% of bugs can be identified by 2-way solutions during testing, provided the tester inputs the defect-triggering features, data ranges, or configuration options.
Source: "Combinatorial Testing," IEEE Computer, August 2009. Dr. Rick Kuhn, Dr. Raghu Kacker, Dr. Jeff Lei, and Justin Hunter.
18. Hexawise Characteristics
I. Easy to Use (3 simple screens, made for SW testers)
II. Powerful (2-way to 6-way)
III. Flexible (mixed strength coverage)
IV. Insightful (“charting for everyone”)
V. Customizable (for both training and integration)
VI. Easy to Collaborate (e.g., notes with questions)
19. Common Misperceptions
Change management is not always
straightforward.
Resistance to new ideas and different ways of
doing things is common.
Using pairwise and combinatorial test case
generators is no exception; people hearing
about them for the first time raise many
objections, including the ones on the next
few slides...
20. SME Objections:
Chess
The Experts:
• Kasparov and Karpov: "Computers will never beat GrandMasters. They lack the necessary artistry, instincts, and strategic insight."
Reality:
• No contest
• Not even close
21. SME Objections:
Manufacturing
The Experts (in the 50's, 60's, and 70's...):
Three phases:
• It won't work
• It won't work here
• "I thought of it first"
Source: Dr. George Box, applied statistician (has helped dozens of manufacturing companies use Design of Experiments methods to handle combinatorial explosions over a 30+ year career).
Reality (from the 80's to current):
• The "Design of Experiments" methods that Hexawise employs are proven and widely practiced
• Toyota: "An engineer who does not understand Design of Experiments is not an engineer."
22. SME Objections:
Software Testing
The Experts:
Three phases:
• It won't work
• It won't work here
• "Oh wow.... It does work here." (p. 95)
Reality:
• Test design times decrease by >30%
• Defects found per tester hour often more than double
• More defects are found earlier
• More defects are found overall
23. Objection: "It can't test
my 'unique' application."
Belief / Misconception:
We're "different" because we have a:
• Different industry or
• Different programming language or
• Different phase of testing or
• Different type of application or
• Different type of defect we're looking for
The Reality:
None of these make a difference. Hexawise can (and does) deliver dramatic benefits whenever you can't (or don't want to) test every single possible use case.
Hexawise is being used successfully by hundreds of users in all kinds of different:
• Industries
• Programming languages
• Phases of testing
• Types of applications
24. "Not all variables are equally important"
Want to test some stuff "more"? 3 approaches:
I. (Blunt Approach) "IE7 is way more common than Opera and Safari"
• "Clean": add IE7 twice. IE7 is tested twice as much, but the number of required tests might increase.
• "Cheat": include "Opera or Safari" as a single combined value. But now: is Safari tested at least once with "No. of stops in trip = 3"? Maybe.
II. (Subtle Approach) "Car directions are more common than public transit directions" or "Errors are more likely in XYZ new feature"
• "Clean" solution: add each parameter in its order of importance.
• For parameters with a small number of values: "mission accomplished." If Y/N: 60% of "freebies" will automatically be "Y", 40% will be "N".
• For parameters with a large number of values: "no freebies" / nothing changes.
III. (Math Whiz Approach) "Interactions between these [5] features are more likely to cause defects than with the rest of the variables"
• See "mixed-strength solutions" (slide 12).
25. Questions before case study
(A recap of the agenda from slide 3, pausing for questions before the Blue Cross Blue Shield North Carolina case study.)
26. Blue Cross Blue Shield NC -
Overview and Benefits
What are the benefits of Hexawise / combination testing...
...to the automated testing team?
1. Effective test data generation - Using a scientific approach to generate data is more efficient. The combinatorial data gathered from the Hexawise tool is far better than having a tester make an estimate based on past experience. Our experience has shown that by using the tool we can create tests that have greater depth and breadth.
2. Automatic generation of this data lends itself well to test automation. The tool outputs CSV files which can be imported into an automation tool, so fresh data can be at your disposal very quickly.
...to manual testers?
1. While we did not use the tool for this purpose, the above benefits would apply equally.
2. The fact that you can use the tool to reduce the number of test cases you run yet maintain a high level of coverage is also a benefit.
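The CSV-to-automation workflow the team describes can be sketched in a few lines. The column layout in SAMPLE below is a guessed example of a generated-tests export (header row of parameter names, one row per test), not Hexawise's documented format:

```python
import csv
import io

# Hypothetical sample of an exported tests CSV; the real export
# format may differ.
SAMPLE = """Browser,View,Units
IE6,Map,Miles
Firefox,Satellite,KM
Opera,Hybrid,Miles
"""

def load_tests(csv_text):
    """Parse a tests CSV into a list of {parameter: value} dicts,
    one dict per generated test case."""
    return list(csv.DictReader(io.StringIO(csv_text)))

test_cases = load_tests(SAMPLE)
for case in test_cases:
    # Each row would drive one data-driven run of the automated script.
    print(case["Browser"], case["View"], case["Units"])
```

In a real harness, each dict would be fed to a parameterized test rather than printed, which is what makes regenerating fresh data between cycles so cheap.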
27. BCBSNC
What were the challenges with using Hexawise?
1. The integration of the data into the automated tests was not as easy as we first thought. We had to develop some Excel macros to convert the Hexawise data into data that we could plug directly into our test automation (i.e., we had to convert ranges output from Hexawise into a single number within the range). Also, in one particular case, we had to change the format because of the complexity of the test we were doing; we could not just use the row/column format output from Hexawise. But these development tasks were one-time tasks that we do not have to repeat each test cycle, so our next cycle will benefit from the investment in time.
2. Sometimes understanding how to capture certain types of data input (or modeling) in the tool was challenging.
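The range-conversion step the team solved with Excel macros is simple to express in code. In this sketch the "1-10" cell format is an assumed example, not the actual Hexawise output:

```python
import random
import re

def pick_from_range(cell, rng=random.Random(7)):
    """Convert a range-valued cell like '1-10' (an assumed
    Hexawise-style export value) into one concrete number inside
    the range; pass non-range cells through unchanged."""
    m = re.fullmatch(r"\s*(\d+)\s*-\s*(\d+)\s*", cell)
    if not m:
        return cell
    lo, hi = int(m.group(1)), int(m.group(2))
    return str(rng.randint(lo, hi))

print(pick_from_range("1-10"))   # some number between 1 and 10
print(pick_from_range("IE6"))    # IE6
```

The seeded generator keeps the picked values reproducible between test cycles; dropping the seed would give fresh data each run, as the previous slide describes.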
28. BCBSNC
What are lessons learned using Hexawise and/or other thoughts?
1. To ease the data integration, an API would be nice to have so that test automation can further exploit the benefits of Hexawise.
2. It behooves the user of Hexawise to clearly understand how the various inputs affect the SUT (System/Software Under Test). In our case as automation engineers, we had to go back to the SMEs (Subject Matter Experts) and consult them on the effects of the data we were about to hit their system with. This exercise proved to be eye-opening for both us and them, and resulted in better all-around tests (increased coverage/efficiency) being designed.
3. Having some documented scenarios on inputting data would have been helpful. While most things are fairly obvious, there are some techniques that need to be learned in order to get the best results from the tool.