Thursday, April 17, 2008

TESTING VS. EXERCISING

Are you taking the dog for a walk or making him jump through hoops?



I see a significant difference between testing code and exercising code. If someone sits me down at a PC and tells me how to access a given app, I can immediately begin to exercise the code. I can move through the screens, try various options, etc.

But is that testing?

Not really.

I consider that exercising the code. I might, as I move through the options and try different actions, run into something that APPEARS TO ME to be an anomaly – an error of some kind. I can write that up. But the things I find that appear to be incorrect will be based on my own experience and opinion as to proper system operation. I have over 25 years of experience, so chances are pretty good I’ll “guess” right. But what if I have only one or two years of experience? Chances are I’ll miss a lot of errors because they look OK to me (or I never tried certain scenarios) and/or I’ll write up a bunch of errors that aren’t really errors. This is, to my mind, the worst type of “exploratory testing”. It isn’t “testing” and I feel it drags out testing sessions unnecessarily, as someone relatively clueless tries to find out what “should” be happening.

Testing implies a comparison of actual results to expected results. Car analogies are a bit passé, but consider if you were “testing” the thickness of the outer shell of a car door. There are federal standards for this. What if you didn’t know what the standards were? What would you be able to determine? Well, you could determine the thickness of the outer shell. But you wouldn’t know if it was good or bad. Maybe you’d have a personal opinion as to good/bad, but you wouldn’t know if that test “passed” and would meet federal standards.

Performance/load/stress testing is interesting in this way. You might get a requirement that says at peak load of 2000 users, response time must remain under 4 seconds. That is something measurable; usually it can be tested. But you can get other types of requests to just start at some number of users and keep adding 100 until the team “finds out when it breaks”. “Breaks” may mean anything – timeouts, lockout, crashes, etc. Then the project team looks at the results and decides if the breaking point looks acceptable. The dangerous thing about this type of exercise is that the team can guess wrong. Many times they’ll make their decision as to what is acceptable based on wishful thinking or optimum conditions. Or even time left in the schedule to tune. I’ve watched project teams ignore results (logically trying to justify “why” results were bad in QA) because the QA system did not match production (more power in production), only to have exactly the same catastrophic events occur within one hour of going live. That same type of risk is inherent in any type of “exercising” vs “testing” scenario. You’re guessing. Some people are good guessers and some are bad guessers.
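To make the difference concrete, here is a minimal sketch in Python. The 2000-user and 4-second figures come from the requirement described above; the `LoadResult` type, the `evaluate` function, and the 4.6-second measurement are all made up for illustration. With a requirement, a measurement turns into a verdict. Without one, it stays a number that somebody has to guess about.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoadResult:
    """One load run. Hypothetical structure, for illustration only."""
    users: int                # concurrent users during the run
    worst_response_s: float   # slowest response observed, in seconds


def evaluate(result: LoadResult, max_response_s: Optional[float]) -> str:
    """Return a verdict only when there is a requirement to compare against."""
    if max_response_s is None:
        # No requirement: we have a measurement, not a test result.
        return "UNKNOWN"
    # "Under 4 seconds" is read here as strictly less than the limit.
    return "PASS" if result.worst_response_s < max_response_s else "FAIL"


peak = LoadResult(users=2000, worst_response_s=4.6)  # made-up measurement

print(evaluate(peak, max_response_s=4.0))    # compared against the stated requirement
print(evaluate(peak, max_response_s=None))   # the "find out when it breaks" request
```

The second call is the "exercise": the run happened and the data exists, but pass or fail is left to whoever reads the report.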

Say I’m given a new website to “test”. I have no specifications to speak of; maybe a few emails. I bring it up and the color scheme is puce, orange, and violet. All of the submit buttons contain little happy faces. Now because I’ve been in the field for a while, I’m probably going to go back and ask if that’s REALLY what was desired. I’m also going to make my professional opinion clear. But if you give that to someone with little experience, they might just blink a few times, and start exercising features. In other words, they will ignore what they are unable to definitively view as an error. They will focus only on items that their own limited experience tells them are absolutely wrong. If you’re working with an off-shore team, they would also be reluctant to tell a client their color scheme appeared to be vile and to ask questions.

I believe that in order to “test” something, you must understand and have knowledge of what the system is supposed to do. You should have some understanding of what it should look like and how it should behave. You can then design tests that verify it does what it is supposed to do, looks the way it should look, etc. It then becomes significantly more apparent what type of tests you need to run in order to ensure you can’t make the system do what it is NOT supposed to do.

Structured testing is a logical progression. Unstructured exercising of code is all over the map. Structured testing finds a significant number of errors. Unstructured exercising might find 10%, and is based on luck more than skill.

Here’s the “rub”. Exercising code is fun and easy. Testing code requires thought, intelligence, and a skill set.

I believe that without specifications of some kind – format doesn’t matter – using the term “testing” is specious.

We have many common issues in this field and they haven’t changed much in over 20 years. One of those is poor specifications. The format and methods for arriving at specifications have changed, but one of the major problems most QA staff cite when discussing their jobs is poor specifications. It affects everyone – development, QA, the end user, and on and on. Experienced QA staff learn to ask questions and find the answers they need. But not everyone is experienced – either on the QA team, development team, or downstream groups. So what can be done?

Well, mentoring and training of your own QA staff is first on the list. A tester afraid to ask questions is inevitably a bad tester. An untrained tester won’t find as much as a trained tester. Every member of my staff is required to go through a 3-day bootcamp to ensure we’re all on the same page. That means we all use the same terminology, and everyone has been trained on the same basic test techniques in the same way. This doesn’t mean we’re locked into ONLY those techniques, but it does ensure everyone on our staff is trained and expected to be familiar with the basics. Everyone in our field is so very dependent on specifications of some kind that I’d advise every QA professional in the field to start supporting and promoting the importance of the BA (or whoever writes specs in your firm) and the criticality of specification to the smooth, on-time delivery of a product your end users really want. Are you on an XP project? What are the user stories like? How do they progress in level of detail?

I realize users change their mind and yes, I realize requirements of any kind can change significantly during the course of a project. I realize that just the word “specification” is anathema to some people. But format of specifications is immaterial, and even an informal change approval process can solve that problem. The point is that the project team needs to know what to deliver and the test team needs to know what they are testing against. And they need to know with sufficient time left in the schedule to do their jobs the right way. If you do not have anything to test against, you’re just walking the dog – exercising the code hoping to stumble over something important. Anyone can walk a dog. Does it make you crazy when you hear developers, BAs, etc. tell you they’ve done QA? Every fool who has ever walked through a website or clicked on a button thinks they know how to test. Are they right? Is that all you do? When I’m feeling Evil and I get that type of comment, I ask them where their test artifacts are located (blank stare). I ask them about their favorite test technique – equivalence partitioning (blank stare)? Paired testing (blank stare)? By then they get the idea and I can just laugh and say “Everyone tells me that.”
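For anyone on the receiving end of that blank stare, here is roughly what equivalence partitioning looks like, as a small Python sketch. The rule being tested (an order quantity must be between 1 and 99) and the function `accepts_quantity` are hypothetical, invented for this example. The idea is that the specification divides inputs into groups that should all behave the same way, so one representative value per group stands in for the rest. Notice that the partitions come from the spec; without one, there is nothing to partition.

```python
def accepts_quantity(qty: int) -> bool:
    """Hypothetical rule from a hypothetical spec: quantity must be 1 to 99."""
    return 1 <= qty <= 99


# One representative value per partition, paired with the result the spec calls for.
partitions = {
    "below range (invalid)": (0, False),
    "in range (valid)": (50, True),
    "above range (invalid)": (100, False),
}

for name, (value, expected) in partitions.items():
    actual = accepts_quantity(value)
    verdict = "pass" if actual == expected else "FAIL"
    print(f"{name}: input={value} expected={expected} actual={actual} -> {verdict}")
```

Three inputs cover three classes of behavior, and each one has an expected result written down before the code is run.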

My blog here is not going to magically change The Way Things Are. There will still be applications “thrown over the wall”, bad specs (or no specs) will still abound, and some testers will still meander around in new code, totally clueless sightseers, tripping over a bug now and then out of sheer luck.
But if this makes you think – even for a few minutes – about what you supposedly “test”, the QA world will become a better place for a few shining moments… I sincerely hope all of you champion improvement and positive change. It’s part of our mission. I feel a Blues Brothers song coming on….

Tuesday, April 15, 2008

TESTING MYTH-TAKES - PART THREE

“IT ISN’T A BUG UNLESS IT BUGS SOMEONE”


My apologies to the originator of this quote; I’m willing to give credit where it is due if someone lets me know where this originated. It now (sadly) gets quoted all the time.

Let’s talk about this for a while. It may end up being a question like “If a tree falls in the woods and no one hears it, does it make a sound?”

I’m one of those people who say “Yes, sound is made whether something is around to hear it or not”.

Generally speaking, I’m not especially philosophical. It’s just a truth – neither good nor bad. I get impatient contemplating or debating stuff that either doesn’t have an answer or the answer doesn’t matter much in the Grand Scheme of Things.

But certain statements immediately set off some weird switch in my head; for lack of a better description, I’m going to have to refer to it as my internal “bullshit meter”. This statement hit it right away; my entire cranium cavity was reverberating with the equivalent of air raid sirens.

OK, let’s say I find an error. The nature of the error is that the color and size of a given icon are incorrect. Functionality is not affected. The user is willing to take the system as it is.

Is it a bug?

Oh yeah. It’s a bug all right. Unless the user comes back and says they PREFER it that way, all they’re doing is saying “I can live with this bug right now”.

OK. Say I find another bug. The problem is that an internal app that the user never sees is using up twice the amount of cache as normal. There’s plenty of space right now, the user doesn’t care about it, and fixing it is not a priority.

Is it a bug?

Oh yeah. If your company continues to make bad decisions of this kind and lets this type of bug go, eventually your app is going to tank. To quote a famous movie, “Maybe not today, and maybe not tomorrow, but soon….”.

Let’s take a look at a successful company with a relatively successful testing organization and a good history of giving their end users what they want.

This company reviewed every bug that was found during the testing period and included the end user in decisions as to whether to fix or not fix each bug. Since their testing periods were relatively brief, they normally fixed every urgent or serious bug, deferring less important bugs to future migrations. What actually happened, however, was those deferred bugs might or might not have ever been fixed, depending on their priority and severity and staff available. It was normal for development staff to ignore anything they didn’t personally feel was patchworthy.

The company in question might have fixed 250 bugs and deferred 45 “cosmetic” fixes. For every migration.

Over time, what happened was that certain areas of certain applications became “fragile”. That means there were so many little bugs everywhere that touching anything broke something else. There were areas of the code that developers hated to even look at. The users had given up on even reporting smaller errors, regardless of how much they “bugged” them, because they knew they would never get fixed. A few areas of the company had even committed their tribal knowledge of “how to get around” errors to procedures manuals. Imagine writing procedures manuals on work-arounds for bugs!

What eventually happens in such cases is that applications become so unstable they require a complete rewrite. But if the original habits never change - if nothing but urgent and critical bugs are “cared about”, the same situation is going to occur with the new code.

My point here is that bugs don’t have to “bug” everyone. It’s a bug regardless of whether someone “cares” about it at that time. Have any of you ever worked for a company that regularly tries to “talk the user out of” a bug? I’ve worked for several. This means strong-willed individuals can actually change the behavior of less assertive individuals to the point they won’t even bother to tell them what “bugs” them. The normal scenario here is development dictating what the user wants or needs. Are you going to put your company’s future into the hands of a 28-year-old developer who has as much in common with your users as a rock has with a porpoise? I’m not particularly moved if Binky BadCode thinks something isn’t important. Binky hasn’t been in the field long enough to know what is important or to have seen the fallout from bad decisions. I’ve had prod support managers in defect review meetings talk about deferring bugs that “aren’t important” and in almost the same breath complain about the number of bugs found in production. There’s a link between the two.

How about those of you who work in shops where if the user doesn’t report it, it isn’t a bug worth fixing? It doesn’t matter that by the time the user finds it, it will cost 200 times more to fix or may have corrupted something else. What a waste of time and money!

Proper care and treatment of bugs should be addressed in order to protect the viability and future health of your organization.

Can a bad situation like the above, where the apps become fragile over time, be halted or reversed? Yes. The team can change their Evil Ways and set some quality standards (I can see a lot of people wince over the word “standards” from here…). Standards are not always bad and they can help turn around a problem. I’m not suggesting the same standards for every company. Maybe you want to state that you cannot break existing code and anything that breaks existing code must be fixed before production. Maybe you want to set an 80/20 rule for new projects – 80% of all defects have been fixed and the remaining 20% are non-critical. Maybe you want to say that all defects found in a migration are fixed within 3 months of production in severity order. A variety of defect management techniques can help; what solves your issues will depend on your environment.
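A standard like the 80/20 rule is easy to state precisely enough that nobody has to argue about it in a defect review meeting. Here is one possible reading as a Python sketch. The `Defect` type, the severity labels, and the function name are my own assumptions for illustration, not any particular defect tracker’s format. The 250 fixed and 45 deferred “cosmetic” counts are the per-migration figures from the company described earlier.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Defect:
    """Hypothetical defect record; the severity labels are assumed."""
    id: int
    severity: str   # e.g. "critical", "serious", "cosmetic"
    fixed: bool


def meets_80_20_rule(defects: List[Defect]) -> bool:
    """True if at least 80% of defects are fixed and no open defect is critical."""
    if not defects:
        return True  # nothing was found, so there is nothing to hold the release
    fixed_count = sum(1 for d in defects if d.fixed)
    open_critical = any(d.severity == "critical" and not d.fixed for d in defects)
    return fixed_count / len(defects) >= 0.80 and not open_critical


# 250 fixed defects plus 45 deferred cosmetic ones, as in the migration example.
migration = [Defect(i, "serious", True) for i in range(250)]
migration += [Defect(250 + i, "cosmetic", False) for i in range(45)]
print(meets_80_20_rule(migration))

# Add a single open critical defect and check the gate again.
migration.append(Defect(295, "critical", False))
print(meets_80_20_rule(migration))
```

A gate like this only says whether one migration may ship. It does nothing about the 45 deferred defects piling up release after release, which is why a time limit on deferred fixes is listed as a separate standard.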

Do I realize some shops don’t have the staff or time to fix everything? Yes, of course I do. I don’t live on Planet Bizarro. I understand the reasons, financial and otherwise. They may be valid for a given migration. Long-term, however, they’re a myth. A Very Expensive Myth. I like the concept of “technical debt”. Google it and see what you think.

Overall, what I’m saying here is that you’re in a TESTING field, dammit. Have you stopped caring about some bugs? Then you’ve lost your edge. If QA doesn’t push for improved/better quality, then there won’t ever BE improved/better quality. You HAVE to care and that means you have to push to get things fixed. Not in an obnoxious way and not to the level of becoming a bottleneck, but there has to be someone who champions excellence and most executive managers are expecting that someone to work in QA. One of the saddest things I see in our field is people who have become ineffective due to complacency or who have fed a downward spiral for their own company through sheer apathy. I’ve known analysts that don’t even write or execute tests any more for things they “know” development won’t care about.

Argh.

Listen, if there’s a bug in the tree and it falls in the woods…….

Well, you understand what I mean.