Variability in User Performance

by Jakob Nielsen on May 15, 2006

Summary: When doing website tasks, the slowest 25% of users take 2.4 times as long as the fastest 25% of users. This difference is much higher than for other types of computer use; only programming shows a greater disparity.


Anyone who's done user testing knows that there are tremendous individual differences among users. Some people sail through user interfaces, while others get bogged down. Even if you've never performed a formal measurement study, you've probably noticed that the fastest users are much faster than the slowest ones.

To better grasp this variability, we can look at the ratio between the top and bottom quartiles of a given study's measured task times:

  • Q3 is the highest task time in the third quartile (the boundary that separates the slowest 25% of users from the rest).
  • Q1 is the highest task time in the first quartile (the fastest 25% of users).

More precisely, at Q1, 25% of users are faster and 75% are slower; at Q3, 75% of users are faster and 25% are slower. So, half of the users lie between Q1 and Q3 and the other half are evenly distributed outside this interval. We divide Q3 by Q1 to compute the Q3/Q1 ratio as a measure of individual differences between users at the low and high ends of performance.
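
As an illustration (this sketch is mine, not code from any study), the Q3/Q1 ratio can be computed from a list of measured task times with a few lines of Python. The standard library's quantiles() function interpolates between observations, which differs slightly from taking the single highest time in each quartile, but the resulting boundaries are essentially the same:

    from statistics import quantiles

    def q3_q1_ratio(task_times):
        # Return (Q1, Q3, Q3/Q1) for a list of task times in seconds.
        q1, _median, q3 = quantiles(task_times, n=4)
        return q1, q3, q3 / q1

    # Hypothetical task times in seconds -- illustration only, not study data.
    times = [28, 45, 52, 60, 65, 72, 80, 95, 110, 130, 155, 188, 220, 280, 350, 420]
    q1, q3, ratio = q3_q1_ratio(times)
    print(f"Q1 = {q1:.0f} s, Q3 = {q3:.0f} s, Q3/Q1 = {ratio:.1f}")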

Example Study Results

The following figure shows an example of Q1 and Q3 from one of our eyetracking studies, in which seventy-six users tried to find the location of the Agere Systems corporate headquarters using the company's website. The data plotted is for the forty-eight users who identified the correct city. (While we don't use the times from the twenty-eight users who failed the task, understanding what caused their failure is obviously important as well; see separate article on how to improve the "about us" info on a website.)

Bar chart: each bar is the time needed by one user to find the location of a company's corporate HQ
Distribution of user website performance. Each column represents the task time for one user. Green columns represent the first quartile (fastest 25% of users); red columns, the last quartile (slowest 25%); and blue columns, the middle two quartiles. Q1 is the time that separates the green and blue columns. Q3 is the time that separates the blue and red columns.

Q3/Q1 shows how much better a fast user is than a slow user. That is, it compares a somewhat fast user (25% are even faster, but 75% are slower) with a somewhat slow user (25% are even slower, but 75% are faster). The ratio doesn't consider the very fastest or very slowest users, since they're likely to be outliers.

In the figure, Q1 is 65 seconds and Q3 is 188 seconds, so Q3/Q1 is 2.9. The very fastest user found the headquarters location in 28 seconds, while the very slowest user required 420 seconds for the same task. Thus, the max/min ratio was 15. Because we're typically more interested in mainstream user experience, I'll focus here on Q3/Q1 rather than on the scale's endpoint extremes.
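
(For readers who want to check the arithmetic, here is the same calculation spelled out in Python, using only the figures reported above; the individual task times aren't reproduced here.)

    # Figures reported above for the headquarters-finding task, in seconds.
    q1, q3 = 65, 188
    fastest, slowest = 28, 420

    print(f"Q3/Q1   = {q3 / q1:.1f}")            # 188 / 65  -> about 2.9
    print(f"max/min = {slowest / fastest:.0f}")  # 420 / 28  -> 15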

Across our recent measurements of seventy website and intranet tasks, Q3/Q1 = 2.4. In other words, slow users spend more than twice as much time as fast users on the same task.

Comparison With Other User Interfaces

We can compare the Q3/Q1 ratio for Web use with other user interfaces by drawing on study results from traditional computer use.

Dennis Egan has compiled studies of three interaction types: text editing, information search, and programming. In my own unpublished 1994 study, I collected performance metrics for common personal computing tasks on three different systems: Macintosh System 7, NeXTStep (the foundation for Mac OS X), and Windows 3.1. Sample tasks included adding up sales figures and sending the results out by email.

The following table shows the average Q3/Q1 ratios from the traditional-use studies compared with that of Web use:

Type of Use           Q3/Q1
Text editing            1.8
Personal computing      1.9
Information search      2.2
Web use                 2.4
Programming             3.0

Text editing is the simplest task in the table, and depends mainly on physical abilities such as typing speed and homing time (moving your hand between the mouse and keyboard). Specifically, the studies looked at the mechanics of moving paragraphs around, bolding words, fixing typos, and so on. The studies didn't examine how long it would take to actually compose a letter or write a book, which would obviously show far greater variability among users.

Personal computing is a bit more difficult, and can require many complicated tasks, such as using spreadsheet formulas and integrating multiple applications to achieve a single goal. My study, however, looked only at basic office productivity tasks — not convoluted tasks like configuring a firewall. Because basic PC use is within most people's mental abilities, variability is fairly low.

The last three types of computer use show high variability because they require higher mental processes such as reasoning abstractly, planning multiple steps, juggling numerous observations in short-term memory, and interpreting new information relative to existing knowledge.

Information search and Web use are quite similar, which isn't surprising since information search forms a large component of Web use. However, beyond simple search engine use, website use often requires sorting, scanning, and interpreting many types of listings — from news headline lists to category pages with supposedly similar products. Also, once website users find information, they have to actually interpret it. In addition, website users often must determine how to gather additional information (if, for example, their first solution lacks sufficient credibility or fails to provide everything they need).

Finally, programming demands the most of users and thus shows the highest variability. This table shows why the number one guideline for managing software development is to hire the best developers: Good developers are three times faster than slow ones and offer companies tremendous gain — even when they require higher salaries. (The difference between the very best and very worst developers is typically about a factor of twenty. Unfortunately, not everybody can hire only the top 1% of developers. But you can certainly endeavor to hire from the top 25%.)

The Web Is Difficult

The more difficult a problem, the more individual differences we see. As we approach the limits of human capabilities, the benefits of additional brainpower — mental abilities, talent, or whatever you want to call it — increase.

When using a website, for example, a user who can hold six chunks of knowledge in short-term memory has great superiority over someone who can hold only four chunks. The user with the better memory is less likely to repeatedly go down the wrong path and more likely to correctly assess how a given page relates to previous pages. In contrast, a higher-capacity short-term memory doesn't help much in simple text editing tasks, assuming you have a decent word processor that doesn't require you to remember six things to move a paragraph.

As the table shows, the Web has the second-highest individual variability of the five computer-use types.

This high variability is bad because it reflects a degraded user experience for some people. After all, the fastest users' times show that it's possible to complete the task that quickly; anything slower means users were delayed or sidetracked by usability problems. In a perfect user interface, people would never be in doubt about what to do and would run no risk of making a wrong move, so all users would perform about the same, with only minor differences caused by factors such as how fast they can click the mouse.

Programming has the largest individual differences and is the most difficult task category. However, programming is not the worst problem because we can legitimately select our programmers from among those with the best performance. That is, the solution is simple: don't hire bad programmers.

For websites, we don't have the luxury of selecting only the best users. We must cater to the people who visit our website, regardless of their abstract reasoning skills. People in the last quartile are customers, too.

Even for intranets, it's unacceptable to demand only highly skilled users. Intranets are for all employees, and in many job categories, advanced computer skills are not the most important hiring criterion.

For government sites, it's literally true that they must serve all users: We all pay taxes, and we all deserve decent service in return.

Commercial websites don't always need to support a broad consumer audience — B2B sites, for example, can often assume some user expertise in their site's specialty. But even if you sell, say, temperature measuring probes and your target audience is measurement engineers, you can't assume they're all in the top 25% of engineers. A distribution of skill and knowledge will still exist, and some prospective customers will be buying your type of probe for the first time in their careers.

Because website interfaces are challenging and must serve audiences beyond the elite, it's particularly important that we tighten up the Web user experience and reduce variability in user performance.

For more on how to measure users' performance with a design (and how to deal with the inevitable variability), see our full-day course Measuring User Experience.

Reference

Dennis Egan, "Individual Differences in Human-Computer Interaction," in Handbook of Human-Computer Interaction, Martin Helander (ed.), Elsevier Science Publishers, 1988, pp. 543-568.

