Gateway Pages Prevent PDF Shock

by Jakob Nielsen on July 28, 2003

Summary: Spare your users the misery of being dumped into PDF files without warning. Create special gateway pages that summarize the contents of big documents and guide users gently into the PDF morass.


My previous Alertbox explained why PDF is unsuitable for presenting information online. Now, let's see what we can do about the problem.

Solution: Gateway Pages

In my previous column, I quoted users who hated several design decisions:

Four different departments, four types of data, one result: user misery. Websites use PDF despite its weaknesses because it supports ease of posting, even as it denies ease of use. Basically, content providers save money by not having to convert the information into a Web-suitable format.

Ideally, companies would reformat each type of information for online use. It's actually not very expensive to, say, create a set of Web pages for annual report information as long as the Web design is done while the annual report is being written. The cost comes when companies have a glossy annual report already finished and then say, "Webbify this."

If you distribute documents for printing or if you absolutely have to repurpose existing content into a substandard user experience, at least protect your users from nasty surprises. Create a gateway page for each PDF document and make sure that users are always guided through the gateway:

  • All links to the information should be to the gateway page; none should go directly to the PDF file.
  • The gateway page should include a short summary of the PDF file so that users can assess whether they want to go to the trouble of entering PDF-land.
  • The gateway page should clearly warn users that they'll be getting a PDF file. It should also state the file's page count and download size.
  • Break big PDF files into sections and offer separate links into each one, with a brief summary of the content next to each link. Also, provide a link to a single file that includes all pages, and tell users to use this link if they want to print the document.
  • Consider adding instructions for how to download the PDF file without the annoyance of having it open in the browser. Unfortunately, this is difficult for average users to do with current technology; it would be nice if there were a special type of link that would always download a file rather than displaying it.
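
As a rough illustration of the guidelines above, here is a minimal Python sketch that assembles the HTML for a gateway page. Everything named in it (the `PdfSection` structure, the `gateway_page` and `format_size` helpers, the sample URLs) is hypothetical and invented for this example. It assumes you already know each file's page count and size.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class PdfSection:
    """One downloadable PDF file (hypothetical structure for this sketch)."""
    title: str
    summary: str
    url: str
    pages: int
    size_bytes: int


def format_size(size_bytes: int) -> str:
    """Render a byte count as a rounded, human-readable download size."""
    if size_bytes >= 1_000_000:
        return f"{size_bytes / 1_000_000:.1f} MB"
    return f"{max(1, round(size_bytes / 1000))} KB"


def pdf_label(item: PdfSection) -> str:
    """Warn that the link is a PDF and state its page count and size."""
    return f"(PDF, {item.pages} pages, {format_size(item.size_bytes)})"


def gateway_page(title: str, summary: str,
                 sections: list[PdfSection], full: PdfSection) -> str:
    """Build a gateway page: summary first, then one labeled link per section."""
    lines = [
        f"<h1>{escape(title)}</h1>",
        f"<p>{escape(summary)}</p>",
        "<p>This document is available as PDF files only.</p>",
        "<ul>",
    ]
    for s in sections:
        lines.append(
            f'<li><a href="{escape(s.url)}">{escape(s.title)}</a> '
            f"{pdf_label(s)}: {escape(s.summary)}</li>"
        )
    lines.append("</ul>")
    # Separate link to the single complete file, intended for printing.
    lines.append(
        f'<p>To print the whole document, use the '
        f'<a href="{escape(full.url)}">complete file</a> {pdf_label(full)}.</p>'
    )
    return "\n".join(lines)
```

Calling `gateway_page` with a title, a short summary, a list of sections, and the complete file returns one HTML string in which every PDF link carries its format warning, page count, and download size.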

If you refer users to PDF documents on other websites that follow these guidelines, always link to the gateway page, not directly to the PDF.

Finally, on the gateway page, follow the guidelines for opening PDF files in new windows.

Thwarting the Search Engine Spider

If you have PDF files on your website or intranet, it's essential that you prevent search engines from including these documents in results listings. This is true for both internal search engines and public search engines. When search engines index PDF files, they unceremoniously dump users at the first page, even though the terms users searched for might be deep within the document. This is incredibly confusing and unhelpful.

Internal search engines often have a setting that specifies the data types they should spider. If you simply turn off PDF files, your users can breathe easily. You might also want to block users from directly accessing other complex data types that require a summary page or other navigation aid.
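
If your internal search engine has no such setting but lets you filter the list of URLs before indexing, the same effect can be approximated with a small filter. The sketch below is an assumption about how such a filter might look, not a feature of any particular product; `EXCLUDED_EXTENSIONS` and `should_index` are hypothetical names.

```python
from urllib.parse import urlparse

# Hypothetical setting: file extensions the internal crawler should skip.
EXCLUDED_EXTENSIONS = {".pdf"}


def should_index(url: str, excluded=EXCLUDED_EXTENSIONS) -> bool:
    """Return False for URLs whose path ends in an excluded extension."""
    # Look only at the path, so query strings such as "?v=2" are ignored.
    path = urlparse(url).path.lower()
    return not any(path.endswith(ext) for ext in excluded)
```

Adding further extensions to the set would cover other complex data types that need a summary page in front of them.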

Public search engines don't allow the same level of control. The sidebar offers a few ways to keep undesired files out of search engines. Unfortunately, the available techniques are all kludges at the current stage of technology.
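
One widely used technique, which may or may not be among those in the sidebar, is the robots exclusion convention: a `robots.txt` file at the site root that asks crawlers to stay out of certain paths. The sketch below assumes all PDF files live under a `/pdf/` directory (an assumption made for this example) and uses Python's standard `urllib.robotparser` module to check that the rules do what you intend. Keep in mind that `robots.txt` is advisory: well-behaved spiders honor it, but nothing enforces it.

```python
from urllib.robotparser import RobotFileParser

# Example rules, assuming every PDF file is stored under /pdf/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdf/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(url: str, user_agent: str = "*") -> bool:
    """Report whether the rules above permit a crawler to fetch the URL."""
    return parser.can_fetch(user_agent, url)
```

With these rules, gateway pages stored outside `/pdf/` remain crawlable, while the PDF files themselves are excluded.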

You can substantially enhance usability by getting public search engines to guide users to the gateway page rather than to the PDF file itself. The one downside is that your site will have slightly reduced search engine visibility since some query terms will occur in the PDF's full text, but not elsewhere on your site.

If you're desperate for traffic, you might choose to reduce usability and let search engines index PDF files. Unfortunately, users who are dumped into your PDF files will rarely turn into loyal customers, since they're not going to benefit from your site-wide navigation and visit other parts of the site. Also, most of those users will never find what they were looking for (because it's probably on page 82), so while they might visit once, they're not going to think very highly of your site.

PDF traffic is worthless traffic. Having search engines index great masses of unnavigable, full-text documents is truly a strategy for the desperate, and not one I recommend.

PDF for Interactive Forms

The latest attempt to sell PDF is to position it as a tool for data capture through interactive forms. Although not nearly as problematic as the traditional monolithic documents, this is still a bad idea.

Forms are the wrong metaphor for workflow support. It's much better to view data entry as an Internet-based application (or intranet-based application, as the case may be) and design a true user interface -- one that takes advantage of all of the GUI elements, conditional workflow structures, and user assistance techniques that have evolved through decades of interaction design for applications.

For thirty years, one of the most fundamental tenets of human factors engineering has been that you should not blindly computerize the way something was done in the past. Existing processes are usually suboptimal and designed under the old technology's constraints. New technology alleviates these constraints and offers us different ways of solving the problem, while introducing new constraints that we must address for the new solution to work.

Office automation is a failed paradigm that has cost companies billions of dollars in lost productivity through substandard "enterprise solutions" that are difficult to use and fail to generate better workflow. Let us not repeat this failure by screen-converting hundreds of complex forms that are usually no good to begin with.

Use Tools for Their Strength

Rather than force it to solve problems that it's much less suited for, let's reserve PDF for what it's good at: printing. It's no disgrace to be the world's greatest solution for a single problem, especially one that's as common and important as printing.

No single answer will address all computing problems. Multiple tools are okay and will result in a much stronger overall user experience.

