Test your navigation with a reverse card sort

Even with a good content management system, it can be hard to rearrange content after you’ve gone live. Take the time to test your proposed navigation with a reverse card sort, so that you can iterate quickly to a working model.

A reverse card sort uses index cards as a stand-in for your site’s menu system. It lets you test the menu system quickly and cheaply without having to code anything. The reverse sort (also called “tree testing”) gives you a good sense of why users can or can’t find items, and you can make changes in near real-time.

Card sorting lets you see how users group the information and tasks on your site. You can use this as a tool to develop your information architecture, but it’s still worth testing that the arrangement you came up with is what users intended when they made those original piles of cards.

Reversing the sort – by using the navigation system you developed from the information architecture to let people complete tasks – gives you the feedback you need to optimize the menu structure.

Overview

Participants work in groups of three. For each task they are given, they use navigation cards that a moderator places on the table to find where they’d go to complete it.

Participants have a stack of index cards with a task written on each card. These could be the same tasks as were used for the initial card sort. Participants discuss together where they think they could complete each task, and write the path they take on the card, including dead ends and backtracks. Notice that the navigation cards in the photo have a number associated with each menu item, and that one of the participants has written those numbers on the task card (at the bottom of the picture).

The moderator places and removes the lower-level navigation cards in response to where the participants say they would “click” on the higher-level navigation cards. This simulates the menu system on your site or in your app. If participants choose a different higher-level item, the moderator removes the cards for the existing item before placing the card for the new higher-level item, much as an online hierarchical menu closes one sub-menu before opening another.

Participants have four choices for each task card.

If they think they found the correct place to complete the task, they can place the task card on the “Found it” stack.
If there was an element of confusion or disagreement between participants, they write a description of what caused the problem and then place the card on the “Confusion” stack.
Wording of the task or of the navigation items might throw participants off. If that’s the case, the card goes on the “Terminology” stack, with a written note suggesting better terms.
If participants can’t find where to complete the task, they can place it on the “Give up” stack. Some groups of participants prefer this to be called the “try again later” stack.

The moderator doesn’t confirm whether participants did or didn’t find the right place, regardless of which stack they put the task card in when they are done. What’s important is where participants think the task is completed, not where the team thinks it should be.

What goes on the cards?

There are two sets of cards – the navigation deck and the task deck.

The task deck is made of cards that each have one task written on them, along with an identifying code (used for analysis). They can be made the same way as cards for a regular card sort. Your participants will run out of steam after about 45 cards or one hour, whichever comes first.
The navigation deck has a card for each level of the menu hierarchy and a set of four cards for the task cards to be placed on when the participants are done with them. I use labels of “Found it”, “Confusion”, “Terminology”, and “Give up”. You can, of course, make whatever changes you want. These labels seem to generate the most conversation, however.

I like to label each of the menu hierarchy cards with their parent menu item, and list the child menu item after each menu option (see the image above for details of this). That way, it’s easy to quickly reach for the correct sub-menu when participants ask for it, and it’s easy for the moderator to keep the navigation deck in order during a session.

The idea is that participants don’t get to see the whole of the navigation structure laid out at the same time. Instead, they have to hunt through it like they would on a regular site. The most they will see at any one time is the whole chain of menus from the top level down to the lowest sub-menu in one category. Obviously they will learn the structure after a couple of repeat visits to the same area, but that again mimics real life.
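If it helps to picture what is on the table at any moment, here is a minimal sketch of the moderator’s job. The menu tree and the function name visible_cards are hypothetical examples of mine, not part of any tool; the point is only that participants see the chain of menus along the path they have “clicked”, never the whole structure.

```python
# Hypothetical menu tree: each key is a menu item, each value is its sub-menu.
MENU = {
    "Home": {},
    "HR": {
        "Benefits": {"Health plan": {}, "Retirement": {}},
        "Time off": {"Request vacation": {}, "Holiday calendar": {}},
    },
    "IT": {"Help desk": {}, "Software": {}},
}


def visible_cards(menu, path):
    """Return the rows of cards on the table after 'clicking' along path.

    The first row is always the top-level menu. Each later row is the
    sub-menu card for the item chosen on the row above it.
    """
    rows = [list(menu)]
    level = menu
    for item in path:
        if item not in level:
            raise KeyError(f"{item!r} is not on the table")
        level = level[item]
        if level:  # leaf items have no sub-menu card to place
            rows.append(list(level))
    return rows


for row in visible_cards(MENU, ["HR", "Benefits"]):
    print(row)
```

Choosing a different top-level item is simply a new path: visible_cards(MENU, ["IT"]) no longer includes the HR sub-menus, which mirrors the moderator picking those cards up again.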

How many participants?

In some respects, this reverse sort matches the pattern of a closed card sort, where participants are given a certain number of pre-labeled piles that they can sort cards into. For that reason I tend to use similar numbers of participants. 15-20 participants should give you sufficient confidence in the results. That’s 5-7 groups of three. Obviously if you have different types of users, you’ll want to have enough participants from each user type to see whether they use the navigation menus the same way or not.

Running the participants in groups of three means that you must moderate well to ensure that every participant’s voice is heard. A participant who doesn’t take part doesn’t count toward your numbers. It helps if all the participants know each other beforehand and have a similar role or level of experience, so that one person doesn’t dominate the conversation.

How do you collect and interpret data?

For paper-based reverse sorts, I create a destinations spreadsheet with the navigation structure written out in the leftmost columns, indented to show hierarchy, with the reference numbers from the navigation cards. I write the tasks out along the top of the worksheet, one column per task. Then I can tally the results by adding up the number of times that participants chose each location in the navigation structure as the answer for each task.
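If you would rather script the tally than build it in a spreadsheet, the same grid can be sketched in a few lines. This is only an illustration: the task codes, reference numbers and results below are made up, and I am assuming one final destination is recorded per group per task.

```python
from collections import Counter

# Hypothetical results: one (task code, destination reference number) per group.
results = [
    ("T1", "2.1"), ("T1", "2.1"), ("T1", "3.4"),
    ("T2", "1.2"), ("T2", "1.2"), ("T2", "1.2"),
    ("T3", "2.1"), ("T3", "4.3"), ("T3", "1.1"),
]


def tally(results):
    """Count how many groups chose each destination for each task."""
    return Counter((dest, task) for task, dest in results)


def print_grid(counts):
    """Print destinations down the side and tasks along the top."""
    tasks = sorted({task for _, task in counts})
    dests = sorted({dest for dest, _ in counts})
    print("dest".ljust(6) + "".join(t.rjust(4) for t in tasks))
    for dest in dests:
        # Leave a cell blank when nobody chose that destination for the task.
        cells = (str(counts[(dest, t)] or "").rjust(4) for t in tasks)
        print(dest.ljust(6) + "".join(cells))


counts = tally(results)
print_grid(counts)
```

Sorting the reference numbers as text keeps this example short; for a real navigation deck you would probably keep the destinations in menu order instead.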

Reverse card sort data collection: Task 2 seems to be found successfully by everybody. Task 1’s navigation areas might need a bit more differentiation. Task 3’s results are all over the place, which indicates either confusion about the meaning of the task or a menu structure that doesn’t support the task.

Most cells will remain blank. If you’re lucky, all participants will have chosen the location that you wanted them to and so you’ll have a large tally against one navigation menu item for each task.

More likely, you’ll see responses for each task split between a couple of areas in the navigation menu structure. That indicates that you need either to improve the differentiation between the areas, or to provide ways for users to complete that task from both locations (by adding a “related links” style link or by duplicating content).

If you see a large spread of responses for a task, then either the task was ambiguous (you’ll pick this up from whether it was placed in the “Confusion” or “Terminology” pile) or the menu structure isn’t supporting that task for your user group. Look at the comments that participants wrote on the card for that task to get more insight.
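The three patterns described above (one clear winner, a split between a couple of areas, and a wide spread) can be flagged automatically once the destinations are recorded. The sketch below is one possible way to do it; the function name and the 0.8 cut-off are my own illustrative assumptions, not established thresholds, so adjust them to your own data.

```python
from collections import Counter


def describe_spread(destinations, dominant=0.8):
    """Label one task's answers as 'concentrated', 'split' or 'scattered'.

    destinations is the list of locations chosen for a single task.
    The dominant cut-off of 0.8 is an illustrative assumption.
    """
    if not destinations:
        raise ValueError("no destinations recorded for this task")
    counts = Counter(destinations)
    total = len(destinations)
    top_two = [n for _, n in counts.most_common(2)]
    if top_two[0] / total >= dominant:
        return "concentrated"  # nearly everyone agreed on one location
    if sum(top_two) / total >= dominant:
        return "split"         # two areas compete for the same task
    return "scattered"         # answers are all over the place


print(describe_spread(["1.2"] * 6))
print(describe_spread(["2.1", "2.1", "2.1", "3.4", "3.4", "3.4"]))
print(describe_spread(["2.1", "4.3", "1.1", "3.2", "2.1", "5.1"]))
```

A label like “scattered” only tells you where to look. The notes on the task card still tell you whether the wording or the menu structure was the cause.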

You can count the successful “hits” and turn this into a percentage score for comparison against subsequent iterations of the menu structure.
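As a rough sketch of that calculation, assuming you have decided in advance which locations count as a hit for each task (the names, codes and numbers here are hypothetical):

```python
# Hypothetical results: one (task code, destination reference number) per group.
results = [
    ("T1", "2.1"), ("T1", "2.1"), ("T1", "3.4"),
    ("T2", "1.2"), ("T2", "1.2"), ("T2", "1.2"),
    ("T3", "2.1"), ("T3", "4.3"), ("T3", "1.1"),
]

# Hypothetical answer key: the locations the team accepts for each task.
# A set allows more than one accepted location per task.
correct = {"T1": {"2.1"}, "T2": {"1.2"}, "T3": {"4.3"}}


def findability(results, correct):
    """Percentage of attempts that ended at an accepted location."""
    if not results:
        raise ValueError("no results to score")
    hits = sum(1 for task, dest in results if dest in correct[task])
    return 100 * hits / len(results)


score = findability(results, correct)
print(f"Findability: {score:.0f}%")
```

Run the same function over the results from each round of testing and you have a like-for-like number to compare iterations.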

I recently used reverse sorting to test the existing structure on an intranet site as a form of comparison against the redesigned site. Using the existing structure, participants could find their tasks on average 35% of the time. This compared to a 90% finding rate for the same tasks using the final iteration of our redesigned menu structure.

There are a couple of online alternatives to paper cards. These will perform most of the analysis for you. My current favorite is Optimal Workshop’s Treejack tool. This online reverse sort tool is easy to set up, administer and analyze. It gives you more depth of analysis than you’re likely to do by hand, such as indirect success rates (did participants get to the location on a second or third attempt) and visual indications of navigation flows. Also check out the tree testing part of UserZoom’s suite.

The downside to the online tools is that they are context-free. You find out what didn’t work, but you don’t know the reason why because there is no commentary to go with each wrong or confusing task. A combination of face-to-face sorts to pilot test the structure followed by online sorts to get good participant numbers can give you the best of both worlds.

What’s next?

Unless you got very high findability first time round, there is probably room for improvement in your navigation structure. Look back through the user comments that the moderator collected. Look at the types of tasks that ended up on the “Confusion” and “Terminology” piles. See whether there are areas of the navigation structure that users gravitate to on the failed tasks. Re-jig the navigation structure based on what you learned and then run another round of reverse sorting.

I would be happy with an average of around 80% to 85% findability from a reverse sort study. In real life the participants would have had many more cues from the content on the pages they navigated to that would have helped them realize whether they were in the right place or not. Once you reach that level of success, it’s time to build the navigation structure into your code!
