Saturday, March 16, 2013

On the Stoplight Heuristic, and 'Mindless' TDD and Test Automation

I read an interesting article this morning about TDD and one possible outcome of it: developers become so focused on keeping things 'green' that they develop a bias, and jump to alter the code just to make the tests green. It's an interesting read, and a caution to developers who may too easily fall into the Test Driven Development groove. Please read it before making the jump.

This is a risk I've been keenly aware of in my role as a Software Developer in Test, but it applies equally to developers whose main priority is writing application code. Let's consider for a moment the three main statuses we typically see when any automation runs and reports, be it unit, GUI-driven, API, or integration tests. It's the stoplight heuristic: Red, Yellow, and Green.

A lot of people strive to keep things Green, because Green is good. It means you can push forward with more work, and it means the tests (or automated checks, if you've read some of Michael Bolton's work) are 'passing', and that's a Good thing, right? However, that Green could be a false positive. A false positive occurs when a test passes because it isn't checking the right thing, or the right attribute of something in the software, while a defect lurks there unknown because we did not think to test for it.

False positives are tricky and hard to test for. In a fast-paced agile world, the pressure to keep pace with your team and keep progressing the software often puts us into a 'cruise control' style of development, where we're propelled forward at the same velocity, sometimes without fully considering the implications of a new feature. What's more, something could have failed during teardown, in your cleanup or after hooks, and it may have failed silently, giving you no indication that anything was wrong, since the 'expectations' or 'assertions' passed and the test is reported as green.
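To make the silent teardown problem concrete, here is a minimal sketch in plain Ruby. The `run_with_teardown` helper is hypothetical, not part of RSpec or any other framework, and downgrading a passing check to yellow when its cleanup fails is just one possible policy, not an established convention.

```ruby
# Runs a check and its cleanup, reporting a cleanup failure instead of
# swallowing it. Hypothetical helper, not part of any test framework.
def run_with_teardown(check:, teardown:)
  status = :green
  problems = []
  begin
    check.call
  rescue StandardError => e
    status = :red
    problems << "check failed: #{e.message}"
  end
  begin
    teardown.call
  rescue StandardError => e
    # The assertions may have passed, but the environment is now suspect.
    status = :yellow if status == :green
    problems << "teardown failed: #{e.message}"
  end
  { status: status, problems: problems }
end

# The check itself passes, but the cleanup step blows up.
result = run_with_teardown(
  check: -> { raise "totals differ" unless 2 + 2 == 4 },
  teardown: -> { raise "could not delete temp records" }
)
puts result.inspect
```

The point is only that the cleanup outcome ends up in the report next to the assertion outcome, so a green result can't hide a broken environment.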

Yellow in the stoplight heuristic often means inconclusive or pending. It can signify work that's in progress but not completed, or it could be as simple as an example marked as pending in RSpec. Yellow readouts can be good, because they remind us we still have work to do on a given test scenario or example. However, dealing with Yellow also has its risks. It's easy to delete or comment out the 'pending' line that generates the pending or inconclusive report, and doing so might even let the test go green. Watch out, though: with RSpec, an example can be green without having asserted (expected) anything. If no errors are thrown, you might have an example which turns green even though you've not really evaluated anything at all.
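The "green without an expectation" trap is easy to show with a tiny hand-rolled runner. This is a sketch in plain Ruby, not RSpec's API: `CheckContext`, `expect_equal`, and `run_example` are made-up names, and reporting an expectation-free example as yellow is my own assumption about how one might guard against it.

```ruby
# A tiny hand-rolled check runner (hypothetical, not RSpec's API) that
# counts expectations so an example that asserts nothing is not reported green.
class CheckContext
  attr_reader :expectation_count

  def initialize
    @expectation_count = 0
  end

  # Records that a check ran, then fails loudly if the values differ.
  def expect_equal(expected, actual)
    @expectation_count += 1
    return if expected == actual

    raise "expected #{expected.inspect}, got #{actual.inspect}"
  end
end

# Runs one example and maps the outcome onto the stoplight colors.
def run_example(&block)
  context = CheckContext.new
  context.instance_eval(&block)
  # No errors AND nothing checked: inconclusive, not a pass.
  context.expectation_count.zero? ? :yellow : :green
rescue StandardError
  :red
end

puts run_example { expect_equal(4, 2 + 2) }  # a real check that passes
puts run_example { 2 + 2 }                   # runs cleanly, checks nothing
puts run_example { expect_equal(5, 2 + 2) }  # a real check that fails
```

The second example would be green in a runner that only looks for raised errors. Counting expectations is what lets this runner call it inconclusive instead.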

Then there is Red, the bane of CI/CD systems. We don't like to see Red. Like a bull, we often want to charge straight at fixing the issue. But it's important to understand why a given check failed before deciding whether to add a bug fix for the underlying code, or a maintenance story for the test, to your backlog. Sometimes Red means a change we were waiting for has finally happened, so we must now test this feature differently. During a sprint, you may expect such a change and wait for it to land before merging a branch of tests that cover the new behavior. Other times Red may mean that the test caught a legitimate failure, or that something odd went on in the environment.

Then there are false negatives: tests which sometimes fail, not because the test is wrong or because the application failed, but because some assumption we made when writing the tests was not true on the one occasion the tests ran. This raises a number of questions. Was the data we ran against clean? Did something change or go out of scope in the course of running the test? Or, as I have personally experienced, maybe it's a timing issue. We love that automation can be very fast, but sometimes it is faster than the underlying system, and a race condition ensues. Perhaps in this case we should poll for a brief time to see if the status changes to the correct one.
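Polling for a status change might look something like the sketch below. `poll_until` is a hypothetical helper written for this post, the timeout and interval defaults are arbitrary, and the "slow system" is simulated with a counter rather than a real service.

```ruby
# Polls a block until it returns a truthy value or the timeout expires.
# `timeout` and `interval` are in seconds; both defaults are arbitrary.
def poll_until(timeout: 2.0, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    return nil if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

    sleep interval
  end
end

# Simulated slow system: the status becomes "complete" on the third read.
reads = 0
fetch_status = -> { (reads += 1) >= 3 ? "complete" : "pending" }

status = poll_until(timeout: 1.0, interval: 0.01) do
  current = fetch_status.call
  current if current == "complete"
end
puts status.inspect
```

A `nil` return means the status never arrived within the timeout, which is worth reporting differently from an outright wrong status. Keep the timeout short, so a genuinely broken system still turns the check red quickly.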

We as Testers, Software Developers, and Test Engineers must be careful when chasing after the Green bar. The readout of a stoplight is just one thing we can see when looking ahead at our software, and we must not forget that Green simply means that what we checked passed.

I've always understood that green doesn't mean zero defects are present in the application. It means nothing we checked behaved differently from what we asserted. Unfortunately, even a red status does not always mean there's a failure in the code. An open or shared environment can quickly become 'unclean', and a failure could result from something someone else did, which your tests couldn't account for at the time.

Let's also remember that we can't guarantee defect-free software, because there will always be the possibility that there's something we missed checking, or some variable in the system which we failed to take into account when we tested or wrote automated checks. That's why you'll never take the human being out of testing for evolving systems: the ability to analyze and dig deeper is something that no machine can do as effectively as the human brain, and even the best test automation, like any good software, will only test or check what you tell it to check.


  1. Agreed. I'll condense this down and generalise to "Understand the leaks in your abstractions". If you understand that "green" doesn't equal "works", that "test passed" does not mean "no problems", and that "test cases" does not mean "coverage", then you'll avoid a lot of pitfalls.

  2. Yes, I'd agree with that, Chris. My intent with this post was to expand on the understanding of what the colors really mean.
