Scientists interpret the same notions in terms of errors:
Type I error – false positive (false alarm) – when a reported defect is invalid.
Type II error – false negative (miss) – when a defect was not revealed in testing.
See more details on this theory on Wikipedia [http://en.wikipedia.org/wiki/Type_I_and_type_II_errors]
There are statistical measures built on these notions (see the excerpt from Wikipedia.org below).
Full details are here: http://en.wikipedia.org/wiki/Sensitivity_(tests)
So one or two of these metrics can be tracked over time for both manual and automated testing (for instance, shown in QA reports as curves on a graph). I would select Accuracy (ACC), Precision (http://en.wikipedia.org/wiki/Accuracy) and False discovery rate (FDR).
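As a rough illustration, the three metrics can be computed from four counts per reporting period. This is only a sketch: the class and function names are made up for this example, and how you count true negatives (here assumed to be checks that passed where no defect existed) is a choice your team has to make.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Outcomes:
    """Counts for one reporting period (hypothetical structure)."""
    tp: int  # reported defects confirmed as valid
    fp: int  # reported defects rejected as invalid (Type I errors)
    fn: int  # defects that testing missed (Type II errors)
    tn: int  # assumption: checks that passed where no defect existed


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    # Return None instead of failing when there is nothing to measure.
    return numerator / denominator if denominator else None


def accuracy(o: Outcomes) -> Optional[float]:
    # ACC = (TP + TN) / (TP + TN + FP + FN)
    return _ratio(o.tp + o.tn, o.tp + o.tn + o.fp + o.fn)


def precision(o: Outcomes) -> Optional[float]:
    # Precision = TP / (TP + FP)
    return _ratio(o.tp, o.tp + o.fp)


def false_discovery_rate(o: Outcomes) -> Optional[float]:
    # FDR = FP / (TP + FP), i.e. 1 - precision
    return _ratio(o.fp, o.tp + o.fp)


if __name__ == "__main__":
    sprint = Outcomes(tp=40, fp=10, fn=5, tn=45)  # invented numbers
    print(accuracy(sprint), precision(sprint), false_discovery_rate(sprint))
```

Computing these per sprint or per release gives the points for the curves in a QA report.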
For automated testing, I would suggest measuring against manual testing within the scope of test automation coverage, not against overall results (manual + UAT + production). A false positive (false alarm) is then a defect reported by test automation that turns out to be invalid, and a false negative is a defect that automation missed even though the automated tests cover that functionality.
To test your automated tests :), you can run a pilot analysis using the so-called defect seeding technique.
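A minimal sketch of the arithmetic behind defect seeding, assuming the classic estimate: you deliberately plant a known number of defects, run the automated tests, and use the share of planted defects that were caught to judge how many real defects exist. The function name and the numbers are invented for illustration, and the estimate assumes seeded defects are about as hard to find as real ones.

```python
from typing import Optional, Tuple


def seeding_estimate(
    seeded_total: int, seeded_found: int, real_found: int
) -> Tuple[float, Optional[float]]:
    """Return (detection rate, estimated total number of real defects)."""
    if seeded_total <= 0:
        raise ValueError("seeded_total must be positive")
    if not 0 <= seeded_found <= seeded_total:
        raise ValueError("seeded_found must be between 0 and seeded_total")

    # Share of planted defects that the automated tests caught.
    detection_rate = seeded_found / seeded_total

    # If no seeded defect was caught, the total cannot be estimated.
    if seeded_found == 0:
        return detection_rate, None

    # Assume real defects are caught at the same rate as seeded ones.
    estimated_real_total = real_found * seeded_total / seeded_found
    return detection_rate, estimated_real_total


if __name__ == "__main__":
    rate, total = seeding_estimate(seeded_total=20, seeded_found=15, real_found=30)
    print(rate, total)
```

The gap between the estimated total and the real defects actually found is a rough estimate of the false negatives within automation coverage.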


2 comments:
Hi
Sounds like a good technique for testing effectiveness, but I think it's hard to use in a practical scenario.
Thanks
Manish Sharma
Manish,
I don't think this analysis is difficult to incorporate. Basically, it is limited by the defect tracking system in use. If you can filter the defects found by automated tests, then it's easy to count how many were rejected. As for defects that were not found, this is a matter for the test manager (who may rely on the manual testing team): they should clearly know the current test automation coverage and track each newly found ticket to decide whether test automation could have found it or not.
Thus, as a manager, you can run a competition on bugs missed by automation. A good practice, and a simple one.