Impact of Data Quality on the Web User Experience

by Jakob Nielsen on July 12, 1998

The Web is all about content. What if this content has errors? Then users may be unable to use the site, since much of their navigation is content-driven, especially when searching. Examples:

  • The soundtrack for Evita is missing from the list of Madonna's recordings. Most users would think: "I searched for Madonna; I got a list of her CDs; the one I want is not on the list: this shop obviously doesn't carry it, so I must go elsewhere to buy this CD!"
  • One of my friends recently recommended the book Heather has Two Mommies to someone, who went to find it. First she searched for "Heather Mommies" and failed to find it. (A usability problem with search conventions.) Her second try was typing in the whole name of the book. Again, failure.
    When the poor user called my friend back to get the author (persistent!), my friend tried her own search on "Heather, Mommies" (note the comma) and located the book. Turns out that in the database it's listed as Heather has 2 Mommies.
  • Yet another friend recommended Priscilla Salant and Don Dillman's book How to Conduct Your Own Survey as a great book on survey design. Searching for Salant (the first author) resulted in a list of books with the desired title mentioned as being out of print. I almost gave up, assuming that the book was not available any more. Hypertext to the rescue: on the page for the out-of-print book, the second author's name is a hypertext link to a list of his books. Since it's so easy, I clicked on this link to see if Dillman had written other interesting books. Great was my surprise when the list included a different link to the survey book (I could tell that it was a different book because its link color was that of unvisited links). Following this link gave me the desired book (a second edition).
  • I contacted Compaq's tech support to complain that my new 400 MHz machine couldn't boot. They promised to send out a repair tech under the warranty, but nobody showed up. While checking into the matter, I had difficulties because I had reported the defect on one date but the trouble ticket was dated the next day in Compaq's system. Reason: my original action happened at 11 PM California time but their system is somewhere in Texas where it was 1 AM the next day. Solution: Always display dates and other user-related data in terms that relate to the user's circumstances and not to your own system's internal records.
  • Broken links annoy users and deprive them of the value they would have gotten from the destination site.

In each of these examples, a small error in data quality made the site almost useless for users. Most users will give up when their reasonable attempts result in failure, though power users can often find a way around quality problems. Do not assume that the majority of users are power users, even if you are one yourself.
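The trouble-ticket example lends itself to a short sketch. The following Python assumes that timestamps are stored in UTC and converted only at display time. The function name, the specific date, and the choice of "America/Chicago" as a stand-in for the Texas system are all illustrative assumptions, not details of Compaq's actual setup.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def display_for_user(stored_utc: datetime, user_tz: str) -> str:
    """Render a UTC timestamp in the time zone named by user_tz."""
    local = stored_utc.astimezone(ZoneInfo(user_tz))
    return local.strftime("%Y-%m-%d %H:%M")


# Hypothetical report filed at 11 PM California time on July 1, 1998,
# stored internally as 06:00 UTC on July 2.
reported = datetime(1998, 7, 2, 6, 0, tzinfo=timezone.utc)

user_view = display_for_user(reported, "America/Los_Angeles")  # 1998-07-01 23:00
server_view = display_for_user(reported, "America/Chicago")    # 1998-07-02 01:00
```

The same stored value yields two different calendar dates. Showing the user the first one is what keeps the ticket date consistent with what the user remembers.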

Prevent Errors

The best solution to data problems is obviously to eliminate them before they reach the site. Traditional guidelines from the days of mainframe data entry are:

  • Never retype a piece of data; always copy it unchanged from a reliable source if available. Even better: only store each piece of information once in your database and use pointers everywhere else it is used. Then if the info needs to be changed or corrected, it only needs to be updated in one spot.
  • Validate entries against reasonable parameters. For example, only accept legal dates; raise a warning flag if the price of a product is more than twice or less than half the cost of any other product on the site. If entering the name of an author into a book database, check whether the name is in the database already. If yes, it is likely spelled correctly. If not, perform a spell check and display a list of possible alternatives to the data entry clerk.
  • Double your work by having all data entered twice by two different people. If the entries match, then they are usually correct.
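The validation and double-entry guidelines above could be sketched roughly as follows. All function names are hypothetical. The price rule is read here as "more than twice the highest existing price or less than half the lowest", which is one possible interpretation, and the standard library's difflib stands in for whatever spell checker a real data entry system would use.

```python
import difflib
from datetime import date


def is_legal_date(year, month, day):
    """Accept only dates that exist on the calendar."""
    try:
        date(year, month, day)
        return True
    except ValueError:
        return False


def price_needs_review(new_price, existing_prices):
    """Raise a warning flag for prices far outside the site's current range."""
    if not existing_prices:
        return False
    too_high = new_price > 2 * max(existing_prices)
    too_low = new_price < min(existing_prices) / 2
    return too_high or too_low


def check_author(name, known_authors):
    """Return (is_known, suggestions) for an author name typed by the clerk."""
    if name in known_authors:
        return True, []
    # Unknown name: offer similar existing names as possible corrections.
    suggestions = difflib.get_close_matches(name, known_authors, n=3, cutoff=0.8)
    return False, suggestions


def mismatched_fields(first_entry, second_entry):
    """Double entry: list the fields on which two clerks' records disagree."""
    keys = first_entry.keys() | second_entry.keys()
    return sorted(k for k in keys if first_entry.get(k) != second_entry.get(k))
```

A record would be accepted automatically only when mismatched_fields returns an empty list. Anything else goes back to a person for resolution.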

Discount Quality Control

If it is not feasible to ensure perfect data quality, then focus on double-checking the most important information. For example, have a second person inspect the records for all best-sellers to ensure their accuracy. If an item is known to be a best-seller off the Web but doesn't sell well online, then double-check its record.
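As a minimal sketch of the second check, the following Python picks out records that deserve a second look. The Item fields and both thresholds are assumptions chosen for illustration; a real site would tune them to its own sales data.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    offline_rank: int   # 1 = top seller outside the Web (assumed data source)
    online_units: int   # units sold on the site over the same period


def records_to_recheck(items, top_n=100, min_online_units=10):
    """Offline best-sellers that sell poorly online may have a bad record."""
    return [
        item for item in items
        if item.offline_rank <= top_n and item.online_units < min_online_units
    ]
```

The result is a short work list for a human reviewer, not an automatic correction.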

Correct Errors

Enlist the millions of Web users as a distributed quality control department by making it easy for users to report any errors they spot on your site. Have a simple form that can be filled out in less than a minute: remember that the users are doing you a favor. Some sites may even run a modest footer on every page where people can click to report errors on that page.

Offer a small reward to the first user who reports any given error or have regular drawings for a larger prize. The error form should state that the user's contact info will be used only for awarding the prize and in the rare event that it becomes necessary to ask a follow-up question regarding the error report. Also make it possible for users to remain anonymous when reporting errors.
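A rough sketch of the bookkeeping behind such a form might look like this. The class name, the status strings, and the way duplicates are detected (same page plus the same description after lowercasing and collapsing whitespace) are all simplifying assumptions; a real system would need a better notion of "the same error".

```python
class ErrorReports:
    """Track which user was first to report each error."""

    def __init__(self):
        # Maps (page_url, normalized description) to contact info, or None.
        self._first_reporter = {}

    def submit(self, page_url, description, contact=None):
        """Record a report; contact is optional so users can stay anonymous."""
        key = (page_url, " ".join(description.lower().split()))
        if key in self._first_reporter:
            return "duplicate"
        self._first_reporter[key] = contact
        # Only a first reporter who left contact info can receive the reward.
        return "first" if contact else "first-anonymous"
```

Keeping the contact field optional means the form has only two required inputs, which helps keep it under the one-minute target.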

Function Despite Errors

Despite the best-laid plans, most sites will continue to have some data errors, so their user interface must be error-tolerant and protect users against the effect of the errors.

Almost no search engines do spelling checks, but Barnes & Noble's does. So they pull up my books even if a user spells my name Jacob Nielson (or if the user spells correctly but their database is in error).
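One way an error-tolerant search could work is sketched below in Python. This is not Barnes & Noble's method, which the article does not describe; it is an illustration that combines a small number-word table with the standard library's difflib. The function names, the table, and the cutoff value are assumptions.

```python
import difflib
import re

# Treat spelled-out numbers and digits as the same word (small sample table).
NUMBER_WORDS = {"one": "1", "two": "2", "three": "3", "four": "4", "five": "5"}


def normalize(text):
    """Lowercase, drop punctuation, and map number words to digits."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(NUMBER_WORDS.get(w, w) for w in words)


def tolerant_search(query, entries, cutoff=0.75):
    """Find entries matching the query despite punctuation or spelling slips."""
    index = {normalize(entry): entry for entry in entries}
    q = normalize(query)
    if not q:
        return []
    # First pass: every query word appears in the entry.
    q_words = set(q.split())
    hits = [entry for key, entry in index.items() if q_words <= set(key.split())]
    if hits:
        return hits
    # Second pass: fall back to approximate string matching.
    close = difflib.get_close_matches(q, list(index), n=5, cutoff=cutoff)
    return [index[key] for key in close]
```

With a catalog that lists the title as Heather has 2 Mommies, this sketch would find the book for "Heather Mommies", "Heather, Mommies", and the full title with "Two" spelled out. The fuzzy pass is what lets "Jacob Nielson" reach an author stored as "Jakob Nielsen".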
