Glorious dawn

Posted via email from iamjustin’s posterous

Are we done yet?

What it means to be done

This seems to be a somewhat misunderstood concept in the world of software. Most often, I see people attempting to quantify a product as “done” by using various types of open defect reports, statistics on runs of test cases, the status of unit tests, or various other types of automated testing.

I really like the way hospitals use a pain chart for administration of pain medication. It is a great illustration of what I am getting at here. They measure by illustrating emotion with a face, comparing pain levels against something less and something more, showing a description, and also using a 0 to 10 scale. This shows how multidimensional something like pain is.

[Image: pain chart]

This is a good illustration of something (pain) that cannot be quantified but can be measured. Most hospitals in the US use this chart fairly successfully as a way to determine how much pain medication to administer. Since pain isn’t something that can be quantified, the nurses have to base their decision on what the patient says they feel.

Just to be clear(ish) here, by done I mean ready to ship. You don’t have to do anything else but stick the tarball on FTP and send out an announcement. I am clarifying because I have seen terms such as “done-done” thrown about (meaning code is checked in and tested) as if one done is meaningful without the other.

Done is not
• a number
• a metric
• a quantification
• a point in time

Done is
• a judgment
• a feeling
• a comparison

Done means that you think you know enough about what is going on with your product to say that the people that matter will be able to get the value they seek. Though you will always know more than you do today, you do have to make a judgment at some point and basically say “I know enough”.

Now I understand that it is easy to sit on an internet soapbox and write about what something means, but in practice things are far more complex. I know, oh I know.

Dead weather in New Orleans

How would you test this? – Take II

This is my solution to Matt Heusser’s How would you test this? – Take II

Strategy:

For the first pass of testing, I would use a form of session-based test management to create a set of charters or themes that the testing can be grouped into. Each theme is allocated some amount of time for exploration. This time allocation is important for risk management. If nothing interesting is happening in the allocated time box, it may be best that the tester either reevaluate his/her strategy or move on to the next charter. The themes include a set of test ideas that may be important to look at. The test ideas are a guide though, not a strict test plan. If something interesting and important is happening, the tester should pursue that path.

Things that can be easily validated by a computer (the state of a button at some point in time for example) could be automated given the time. This would benefit future regression testing efforts as well as facilitate testing on other supported browsers to some extent.

Charter:
Explore functionality of the filters on the Activity widget.
Some important things to note are:
performance on filtering large sets of data
correct behavior when filtering with multiple filter criteria

Duration:
1 hour (assuming there is already data existing to facilitate testing)

Notes:

  • Current known filters and assumptions are:
    • Post Signals To
      • Defaults if only one is available
    • Showing
    • From
      • People I follow not present if I am not following anyone
    • Within
      • Defaults if I am only in one network
  • Caching of selections?
    • If user leaves widget do previous configured filters remain?
  • A few things to test
    • Single selection of filters
    • Multi selection of filters on single filter (‘showing’ filter for example) and then on multiple filters at a time. Basically test one, some and all.
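
The "one, some and all" idea can be sketched in a few lines of Python. This is only an illustration: the option names below are hypothetical, since the notes do not list what the 'Showing' filter actually offers, and the helper name is made up for this example.

```python
from itertools import combinations

# Hypothetical option names for the 'Showing' filter; the real widget's
# options are not listed in these notes.
SHOWING_OPTIONS = ["signals", "page edits", "comments"]


def one_some_all(options):
    """Return every non-empty selection of options: each single option,
    every partial combination, and finally all options together."""
    selections = []
    for size in range(1, len(options) + 1):
        selections.extend(combinations(options, size))
    return selections


for selection in one_some_all(SHOWING_OPTIONS):
    print(selection)
```

With three options this gives seven selections to try. The list grows quickly as options are added, which is one reason the charter treats it as a guide rather than a checklist.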

Charter:
Explore around the theme of posting to the Activities widget from the post box as well as auto posting from page updates.
Some questions:
Is this application I18N compliant? Do characters display the same in the post widget and in the posted activities area?
Are dates and numbers correctly formatted?

Duration:
2 hours

Notes:

  • Manual posting from activities widget
  • Auto-post from page edit
  • Post date / time stamp
  • Date time stamp based on user locale or server time?
  • Date / time stamp formatting?
  • When does date / time stamp switch from minutes to hours / day / date stamp?
  • Char count decrements when char is entered (whitespace counts)
  • Special and multi-byte characters display consistently in post field and in the posted activities area
  • Post to appropriate audience based on ‘Post Signals to’ selection
  • Can post to one or more network? (multi-select in Post Signals to?)
  • Expand / collapse signal post pane
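
One way to prepare for the character count and multi-byte checks is to work out ahead of time what the counter ought to show under different counting rules. The sketch below assumes a character limit of 140, which is purely hypothetical because the notes do not state the real limit, and it compares counting by character with counting by UTF-8 byte.

```python
# Hypothetical character limit; the notes do not state the real one.
POST_LIMIT = 140


def remaining_chars(text, limit=POST_LIMIT):
    """Characters left if every character, including whitespace, counts as one."""
    return limit - len(text)


def utf8_length(text):
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


samples = ["hello world", "   ", "日本語"]
for sample in samples:
    print(repr(sample), remaining_chars(sample), utf8_length(sample))
```

If the counter in the widget drops by more than one for a single multi-byte character, that is a hint that it may be counting bytes rather than characters, which would be worth a note in the session report.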

Charter:
Explore the theme of functionality around posts that exist within the Activities widget.

Duration:
2 hours

Notes:

  • Online notification
  • Reply
  • Direct message
  • Delete
  • Auto-post from page edit
  • What happens when I delete a message (or auto-update) I posted?
  • Other users screen refresh to remove post?
  • How frequently does polling occur for the online notification icon to go on / off
  • Reply shows at top of list when posted?
  • Private message does not appear in thread
  • Insert account link: link can be clicked to nav to account when in thread or in post field
  • Auto-posts have pencil icon (to indicate they are associated with a page edit)
  • Auto-post do not have:
  • Auto-post has link to edited page
  • Auto-post is posted by user ‘auto builder’

Charter:
How does pagination work on the activities widget when there are 0, 1 and more than one page of posts? How does the appearance of new posts affect pagination?

Duration:
1 hour

Notes:

  • Newer
  • Newest
  • Older
  • If only one page of posts is available, nav is disabled
  • If I am on page 1, Newest and Newer are disabled
  • If I am on last page, Older is disabled
  • older moves to previous page
  • newer moves to next page
  • how does auto refresh work with pagination?
  • Example: I am on page 2, new messages are posted such that messages on page 1 now belong on page 2. Does the page I am on refresh?
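
The enabled and disabled rules in these notes are the kind of thing a computer can check easily. Here is a minimal sketch of an oracle for them, assuming page 1 holds the newest posts and that all navigation is disabled when there are no posts at all. The function name and the dictionary keys are made up for illustration.

```python
def expected_nav_state(current_page, total_pages):
    """Return which navigation links should be enabled.

    Pages are numbered from 1 (newest posts) to total_pages (oldest posts).
    total_pages may be 0 when there are no posts.
    """
    if total_pages <= 1:
        # Zero or one page of posts: nothing to navigate to.
        return {"newest": False, "newer": False, "older": False}
    on_first = current_page == 1
    on_last = current_page == total_pages
    return {"newest": not on_first, "newer": not on_first, "older": not on_last}


print(expected_nav_state(1, 3))
print(expected_nav_state(3, 3))
```

An automated check could compare this expected state against the links actually rendered on each page. The auto refresh question still needs a person watching what happens.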

Charter:
RSS feed

Duration:
1 hour

Notes:

  • Update frequency
  • reader compatibility (Google Reader is most important)
  • filter config in activity widget apply to feed
  • RSS 1.0, 2.0, Atom compliant
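
A quick first step for the format question is to find out which format the feed claims to be. The sketch below uses Python's standard library to look at the root element of a feed document. It only identifies the declared format and does not validate compliance, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def detect_feed_format(xml_text):
    """Guess the feed format from the root element of the document."""
    root = ET.fromstring(xml_text)
    if root.tag == "rss":
        # RSS 0.9x and 2.0 use an <rss> root with a version attribute.
        return "RSS " + root.get("version", "unknown version")
    if root.tag == "{%s}RDF" % RDF_NS:
        # RSS 1.0 is RDF based, so its root element is <rdf:RDF>.
        return "RSS 1.0"
    if root.tag == "{%s}feed" % ATOM_NS:
        return "Atom"
    return "unknown"


print(detect_feed_format('<rss version="2.0"><channel/></rss>'))
```

Checking real compliance would still take a feed validator or a careful read of the relevant specification, plus trying the feed in the readers that matter.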

Charter:
Activity Widget tool menu (monkey wrench icon)

Duration:
30 min

Notes:

  • Activities viewed / page (5,10,15,20,25)
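
Since each charter carries a time box, it is easy to add them up and compare the total against whatever time is actually available. The sketch below uses the durations listed in the charters above. The short charter names and the budget passed to `fits` are assumptions for illustration.

```python
# Charter durations in hours, taken from the charters above.
# The short names are labels made up for this sketch.
CHARTERS = {
    "Activity widget filters": 1.0,
    "Posting to the Activities widget": 2.0,
    "Existing post functionality": 2.0,
    "Pagination": 1.0,
    "RSS feed": 1.0,
    "Tool menu": 0.5,
}


def total_hours(charters):
    """Sum the time boxes of all charters."""
    return sum(charters.values())


def fits(charters, budget_hours):
    """True if every charter can run to its full time box within the budget."""
    return total_hours(charters) <= budget_hours


print(total_hours(CHARTERS))  # 7.5
print(fits(CHARTERS, 8))      # True
```

If the total does not fit the budget, that is a conversation to have about which charters matter most, not a reason to quietly shrink every time box.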

Some test ideas from Matt Heusser’s “How would you test this?” blog

This is not a trivial piece of software to test given your overview of pages, workspaces and networks and activities widgets.
Given that we are using some flavor of Agile:
What was developed for this iteration?

MH [Great questions, Justin. In reality, in the first iteration, we didn't have the next/previous links at the bottom, or the reply icons at right, and you couldn't send signals. All that was developed in the second iteration. If you want to exclude that from your testing, that's reasonable. But I thought I would crank it to eleven, ya know? :-) ]

Are acceptance criteria defined for what was developed?

MH [If you'd like, I can play the role of the product owner and we can collaboratively develop those acceptance criteria. For now, let's say no.]

How much time do(es) the tester(s) have to do their thing?

MH [I would think you could do a decent 1st pass at testing this, in our four supported browsers, in three hours. Maybe add an extra hour if you want to document your scenarios and do a great job brainstorming. Do you agree?]

JR [I can test for this amount of time and see what comes up. If something interesting happens and the person responsible is convinced by what I tell them then maybe it will make sense to spend more time testing]

What are the consequences of going over that time?

MH [After about six hours of testing, you're going to have to work a little modest overtime, but basically, it's a whole team concept. Nobody's going to get fired if you want more time to do a thorough job. The problem is going to be if you want it every iteration.]

Are there any descriptions (business scenarios / use cases) of how what was developed will be used?

MH [I’m sure there are whitepapers out there. You can google around for ‘Socialtext’, here’s a link that might be helpful. If you have your own business or domain name, you can go to http://www.socialtext.com and sign up for a free account for 50 users and try it out – or even run your own test scenarios! This is also a chance for you to try out designing a test strategy for a company developing speculative products for an emerging market.

I hope that helps.]

Justin Rohrman

Simple test plan:

Activities Widget

  • filtering
    • showing
    • from
    • within
  • posting
    • signal
    • edit
    • comment
    • create page notification
    • edit page notification
  • navigation
    • newer
    • newest
    • older
  • Post view
    • special characters
    • multi-byte characters
    • white space
    • RSS feed
  • existing post functionality
    • View user profile
    • reply
    • direct message
    • delete
    • online notification
    • in edit notification

This test plan is obviously missing a lot, but the general idea is to take the key features of this product and explore them over the given amount of time. If something interesting occurs during testing, I will make a note so it can be discussed both in the context of the issue itself and in terms of whether it is worthwhile to test that area further. The idea is to gain as much meaningful information as possible in a short amount of time, since I do have an idea of what makes this product useful (because of previous experience using various wiki products) but do not know the specific release criteria for this product or what would make a feature ‘good enough’ to its users.

With regard to testing on multiple browsers, I would assume that this can be done effectively after a talk with some sort of customer rep to find out what is most important and a talk with developers and testers in the know to find out what is most risky. Some of these things may be good candidates for automated testing given the time.

Kata for testers

What is a kata?
Kata is a Japanese word used to describe a specific set of movements that are practiced in repetition. Kata has roots in martial arts such as Aikido and Kendo, where a choreographed set of movements is the dominant form of training and practice for new practitioners. The philosophy of this repetition of movement is not to move toward a successful moment, but a continuous improvement of your ability to the point that you eventually perform an action without conscious thought.

Inspired by Robert Martin’s tweets about kata he uses, I thought of a few kata that I use to train and improve myself as a tester. There are many of course, but these are the ones that I practice frequently. If you have some that you have found to be beneficial, please mention them to me!

Chess

Heuristic: Look into my crystal ball

A good chess player has the ability to predict their opponent’s next few moves based on the piece just played. A few years back when I was working at coffee shops, I was fortunate to get to play with great players almost daily. Unfortunately we all fell out of touch and I don’t play too often any more. Yeah yeah, I know, there is chess on the internet. It just isn’t the same as playing in person but maybe I will give in and try a few games on chess.com.

Anyhew. Chess will not give you the magical ability to predict an important failure in a piece of software, but it does prove that it is possible to do so with practice. Experience will give you a catalog of important patterns that you can use for future discoveries.

Sudoku

Heuristic: There is more than one way to skin a cat

Sudoku has become a pretty popular game over the past few years; I think there is a regular Sudoku puzzle in the NY Times right along with the traditional crossword puzzle. In the easier versions of this game, you may only need to use a single method to complete the puzzle, making 1 – 9 on a column or row for example. On the more difficult puzzles you have to recognize several different criteria to fill out a single cell. You may need to check that you have 1 – 9 on the row or column as well as within a single grid along with other possible criteria.

To parallel this to testing, you may need to recognize several different things happening simultaneously in your system to notice that something important is happening.

Scrabble

Heuristic: Focus and defocus

This is how I play Scrabble, especially in the middle and later in a game; other people probably have other methods they prefer. I use the focus and defocus heuristic (credit to James Bach) to find new possible word combinations that might be lurking in those many letter tiles laid out on the board.

Looking closely by pinpointing my eyes on a certain area of the board and then slowly expanding my viewing area allows me to find interesting patterns that exist in the already played tiles. Sometimes this pays off and I get a high scoring word, sometimes I find that the pattern is insignificant and I focus and defocus again.

Role Playing Games

Heuristic: Oh, excuse me; I’m just not myself today

Role playing games are lots of fun. I prefer futuristic, cyberpunk styled games like Rifts and Shadowrun but I have played a fair share of D&D too. You get to completely become another person for a few hours on game night. You think like that character, you speak and interact with your environment like that character would and you solve problems like that character would.

The ability to do this in the testing world is paramount. The ability to become another person for a little bit and see the software as that person gives fresh perspective. Something that seems innocuous to you from a technical point of view, being the expert tester that you are, might be the most annoying thing in the world to the 65-year-old lady that has to use your software every day for her job. Become someone else for a day; you might see something interesting.

I am looking for a new group to game with, so if you are in the Houston area and stumble across my post, send me a message!

Medusa the Benevolent

More thoughts on estimation

Software testing is similar to investigative journalism in more than a few ways. The journalist and tester immerse themselves in an environment to interact and learn from experience to be able to tell the tale of their adventures and exploits at some point in the future. Software testing and investigative journalism are similar in the sense that they are not construction. There is no finished product. The end result of both of these activities is knowledge.

The Tale
When telling the tale, the journalist begins with the big, bold events that people like to talk about and other people like to listen to accounts of. Natural disasters fall into this category. This time last year, I was still dealing with the aftermath of Hurricane Ike. Days after the hurricane ended, accounts of Ike were still being told day in and day out. If you were lucky enough to have electricity, this is all you had the privilege of listening to.

After the large, obvious, easy-to-analyze events, there are the events that are often considered equally large and important but are more difficult to analyze and that people are not as eager to discuss. Casualty count of war is a really big deal; you will be hard pressed to see this discussed on the evening news though. It generates dissent in the population and creates a lot of questions as to whether what is being fought for is really worth the cost.

From there people move on down the list of things they encountered until reporting on things like dog weddings and days the local Girl Scout group will be selling cookies.

I’ll leave the drawing of parallels to the reader; there are many examples relating to software investigation and reporting that quickly come to mind.

The Journalist
The journalist telling the tale is investigating for someone that wants to know about what is going on in the world (Wow! Not unlike software testing!). The people that want to know, want to know by a certain date so that they can process, edit and share the information. The journalist is not working leisurely; s/he is working rapidly to discover everything thought of as important to the theme under this time limit.

Often, once the investigation has started the investigator will see a series of events and think “Wow, the information I know right now is very superficial. There has got to be something else going on here. I need more time to figure it out!”

More time…more time…

More time means more money of course, so someone has to be willing to continue paying for this activity. So being the savvy businessman that s/he is, the journalist pulls a number from his butt and goes back and says “I need another month! Then I can know this and this and that”. The person he is reporting to says “Hell no, I’m not paying you for that long. This story has to go out. You have a week.”

The Negotiation
Yes, this example is ridiculously simplistic. But it is a fairly good representation of what a tester goes through in an iteration of development.

Think about this. Before going on an assignment, the investigator is shown a single picture of the scenery he will be in. The picture is not the real scenery though, it was drawn up by someone who thinks this might be what the scenery will look like when you get there. The journalist is expected to estimate how long it will take her/him to gather all the information most worth knowing.

I agree, absurd.

My Review
One thing is clear, there is information here that is unknown and it is worth a substantial amount of money to spend time figuring out what there is to know. Also, the question of wanting to know how long it will take to acquire this information is a very reasonable thing to ask.

Based on the assumption that there is obviously something here worth knowing, why not spend some time to do a short ‘pre-investigation’ and get a better idea of what is going on before giving an estimate?

Today is the most ignorant day of the rest of my life. I will always know more tomorrow than I think I know today.

Though doing this will not allow you to create an accurate estimate, it will be more accurate than what you would have given without the pre-investigation simply because you know more.

Armand Bayou Nature Center

On estimation of testing activities

I, like many others, am being asked to estimate how long it will take to complete testing activities for each thing I am testing. In the past, the estimates were made and were not used for anything except to mark velocity on a burn down chart. Naturally the lines for programmers and testers were widely different, and the question of ‘how can we bring these together’ was frequently the topic of conversation. My answer was for programmers to develop in small testable chunks that someone can test while the next chunk (testable thing) is being developed. Well, this never truly caught on, the lines remained far apart, and at some point I was no longer asked for estimates.

I do not have a problem with making estimates; my issue is that the tester will be forced to commit to her / his estimate and be held accountable for the disparity between the estimate and the actual time required when that person has very little control over the disparity.

The argument for test estimation is that ‘if programmers can estimate, then so can testers’. Creating a piece of software is a finite task; you stop when what you have built satisfies the customer. Testing is a service to provide information about the current state of a piece of software. In my opinion, the tester is done when the information they are providing or the artifacts of their activity stop being meaningful to the person representing the customer. Or in other words, it doesn’t give them any additional insight into their questions about the product. This means we are in effect being told to estimate when a certain emotional response will be elicited.

This might be possible assuming that the things a person was testing were received in a similar state each time and everyone had the exact same understanding of what was being created and that understanding is equal to the customers’ desires. As we are human and fallible, this is very unlikely to happen and so estimation becomes very haphazard and inaccurate.

A scenario: A tester estimates a feature will take 8 hours to test. The tester receives the feature and begins testing, noting several issues along the way. Some issues are fixed that cause regression in parts of the feature that previously worked. Over a period of 2 weeks the product manager checks in periodically to see the state of the feature and each time says the feature is not ready. Has the tester missed the estimation?

I think that Michael Bolton’s assertion that estimation for a tester is really a negotiation is the most correct way to address this problem.

So, I will make estimates to the best of my abilities. But I do not understand what purpose they will serve and I do not want to be held accountable in any way for the disparity between the estimate and the actual.