PDF Files

Introduction

The Portable Document Format (PDF) is a file format that Adobe introduced in the early 1990s for sharing printable documents that look exactly the same, and print properly, on any kind of computer. That now includes tablets and phones as well. All modern computers and devices can display and print PDFs natively. For older computers, PDF reader software is still freely available for download from several publishers, including Adobe.

Product brochures, manuals, technical guides, and government forms commonly use PDF. Desktop publishing artists and printers use PDF to share final proofs, and architects use it to share plans. The IRS was an early adopter, distributing tax forms, instructions, and other publications in PDF format. PDF files can include fill-in forms, which can be printed or saved with the information you type in.

While you can view a PDF file on virtually any device, you can't necessarily edit PDF files (apart from filling in form fields). Editing requires additional software, which is usually not free.

Viewing PDF Files

As mentioned, you can use the built-in capabilities of modern computers and devices to read and print PDFs, or, for better functionality, you can install third-party PDF software.

A very functional and popular reader for a Windows-based computer is Adobe Acrobat Reader. If you haven't downloaded it, or can't, you can view PDFs in any modern web browser. In addition, Windows 8-based computers and devices include the Microsoft Reader full-screen app. The Reader app was still included with Windows 10 in 2015, but was removed with updates to Windows 10 mobile phones in 2016, and to Windows 10 computers and tablets in early 2018.

For Mac computers, the built-in Preview application will display PDFs, but it has trouble with fillable PDF forms. You can also download Adobe Acrobat Reader for the Mac, or use a web browser (such as Safari or Chrome).

If you double-click on a PDF file saved on your computer, network folder, or thumb drive, and your web browser is the only application that will open PDFs on your computer, then your web browser will open and display your file alongside web pages you might have open in other tabs. Otherwise, it will open in your standalone PDF viewer software.

When you click on a link for a PDF file while browsing the web, what happens depends on the configuration of the web server and your computer and web browser. It may download like a regular file into your Downloads folder, requiring you to click it again to open it. Or, it may open in the same or a new tab of your web browser, so transparently that you might not even realize you're looking at a PDF. If you have Adobe Acrobat Reader installed, the PDF may open separately in Acrobat Reader, or it may open in a tab in your web browser but with the Acrobat Reader toolbars (zoom, print, etc.) instead of the PDF tools native to your web browser.
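One server-side setting that influences this behavior is the Content-Disposition response header: "attachment" asks the browser to save the file, while "inline" asks it to display the file in the tab. The following is a minimal sketch in Python of the headers a server might send. The function name, file name, and size are made up for illustration, and the browser's own settings can still override the server's hint.

```python
def pdf_response_headers(filename: str, size: int, download: bool) -> dict:
    """Build HTTP response headers for serving a PDF (illustrative helper).

    download=True  -> ask the browser to save the file to disk.
    download=False -> ask the browser to display the PDF in the tab.
    The browser and the user's settings may still decide otherwise.
    """
    disposition = "attachment" if download else "inline"
    return {
        "Content-Type": "application/pdf",
        "Content-Length": str(size),
        "Content-Disposition": f'{disposition}; filename="{filename}"',
    }


if __name__ == "__main__":
    # Hypothetical file name and size, used only for the demonstration.
    for download in (False, True):
        print(pdf_response_headers("brochure.pdf", 48213, download))
```

The same PDF can therefore behave differently from one web site to another, even in the same browser.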

Mobile devices function about the same. If you tap a PDF in a web browser or file browsing app, the PDF will open in a PDF reader app if you have one installed, or it may use your web browser to display it.

Filling in PDF Forms

When a PDF file has been created with fillable form fields, you can enter information in the fields using the same software you use to view PDFs. But, a few caveats:

  1. If you print a filled-in form from a web browser, the data will be there, but when you save the file, your web browser may not save the data with it. If you need to save it with the data, download the PDF to a folder on your computer, then open it in a separate application, before filling it in.
  2. If you use a Mac, you should download and install Adobe Acrobat Reader to fill in a PDF form, rather than using Preview. Preview may not properly save the data you fill in on PDF forms.
  3. The original author can design the PDF not to allow data to be saved, in which case you have to print the form with the data you filled in before you close it if you want to keep a copy.

Creating PDF Files

Save As / Export

To make a PDF file with original content, you can create the document first using a word processor such as Microsoft Word or Google Docs, or a graphics program such as Adobe Illustrator. Then, use that program to save the file as a PDF. Depending on your program, the menu option to use might be Export instead of Save As.

All modern word processors and desktop publishing programs can save a file as a PDF, so you can send your finished work to someone else without having to send them your editable word processing or graphics document, which might contain past revisions, private notes, etc.

PDF Printer

If your application is not able to save your work into the PDF format, you can use a PDF printer, also called a distiller. This is a program that installs a virtual printer on your computer, where the output goes into a PDF file instead of to an actual printer. To create the PDF, open the original document and select the option to print as usual, but choose the PDF printer instead of your regular printer. You will get a pop-up window giving you the opportunity to assign a file name and select the location to save your new PDF. This is particularly useful for saving a web page to your computer, since saving web pages directly to your hard drive is a messy process compared to printing them to a PDF.

The Apple Mac running OS X has had a PDF printer built in to the operating system for many years. Microsoft only included one as of Windows 10 in 2015. For earlier versions of Windows, free and paid commercial applications are available that enable printing to PDF.

The Google Chrome web browser for Windows has a built-in virtual PDF printer, convenient for printing web pages to PDF.

Scanners

If you have a hard copy of a document, you can scan it to a PDF file using any modern scanner. All current network scanners give you the option to save a scanned document to a shared folder on your server, or send it via e-mail as a PDF. If yours doesn't, you can start with PDF software (such as Adobe Acrobat and products by its competitors Foxit, Nitro, Nuance, and Tracker), and choose the option to create a PDF from your scanner.

Using a scanner, of course, is the only option for things not already on your computer—an old diploma, a love letter from high school, or the ticket you got for rolling through a stop sign.

Editing PDF Files

As mentioned above, the PDF format is optimized for display and printing, so PDF files are not as easily editable as word processing documents. Inside the PDF, the layout of all the text and graphics is pretty much fixed on each page.

Full-featured PDF software (such as Adobe Acrobat and products by its competitors Foxit, Nitro, Nuance, and Tracker) enables you to renumber, insert, remove, crop, rotate, and reorder pages, add a watermark, resize and change margins, add bookmarks and comments, and combine files.

The latest versions of these software products now support editing text in a PDF file directly, almost like a regular word processor. But, each block of text will still be within a separate layout box. Adjacent boxes won't move out of the way, so text that grows may overlap its neighbors. Inserted text will not flow to the next page automatically. So, you will find yourself having to resize and move text boxes around, which can get tedious. And you may encounter quirks in the appearance of the fonts.

So, edit a PDF document directly only if you must: that is, if you did not create it yourself, or if you scanned it in from a hard copy. If you are working with a PDF for which you have the source document on your computer, just edit the source document in its original application, and regenerate the PDF.

Searching for and Copying Text in PDF Files

One interesting aspect of the PDF file format is that it combines the characteristics of a word-processing document with a photographic image file.

A pure word-processing document, such as a Microsoft Word or Google Docs file, stores text as codes that represent the letters, numbers, punctuation, typeface/font, size, layout, etc. These files do not easily integrate hand-written notes or other markings around or on top of the text. But, because of the files' structure, your computer can search through and index documents based on the text they contain, and you can easily copy words or paragraphs elsewhere, such as into a new e-mail message, or to post a comment on a web site.

On the other hand, an image file will show exactly what a physical document looks like, with all the nuances of your handwriting and other marks. In such a file, your computer does not know how to read the text, and so you cannot search for words and phrases, nor copy and paste.

The PDF file format combines the best of both. That is, it can embed machine-readable text, while at the same time have your hand-written notes or signature appear on the same page as, or even surrounding, the text. This allows you to have a PDF file that looks exactly like a hard-copy document, combining typed text, notes, drawings, etc., with the ability to search for or copy text.

However, a given PDF file will not be searchable if it was created as an image, without the text being recognized as text.
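To make the idea of embedded, machine-readable text concrete, here is a minimal sketch in Python that assembles a tiny one-page PDF by hand. It is an illustration only, not how any particular product writes PDFs. It assumes plain Latin-1 text and one of the standard built-in fonts (Helvetica), and the function name is made up for this example.

```python
def build_minimal_pdf(text: str) -> bytes:
    """Assemble a one-page PDF whose page content draws `text` as real text.

    Assumes `text` contains only Latin-1 characters.
    """
    # Escape the characters that are special inside a PDF literal string.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    # BT/ET bracket a text object; Tf picks the font, Td positions, Tj draws.
    content = f"BT /F1 24 Tf 72 720 Td ({escaped}) Tj ET".encode("latin-1")

    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Resources << /Font << /F1 4 0 R >> >> /Contents 5 0 R >>",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of this object, for the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_start = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # entry 0 is always the free-list head
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_start
    return bytes(out)


if __name__ == "__main__":
    pdf = build_minimal_pdf("Hello, PDF")
    # The words are stored as character codes, so they can be found again.
    print(b"(Hello, PDF) Tj" in pdf)
```

Because the words are stored as character codes, a viewer can search for them and copy them. A scanned page stored only as a picture has no such text, which is why it cannot be searched. Real-world PDFs usually compress their page content, so a naive byte search like the one above is only a teaching device.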

Creating Searchable PDF Files

If you export from a word processor, desktop design application, or web browser to a PDF file, the text will be searchable; there is nothing else to do.

When you scan a physical document, the process of converting the words on the page into machine-readable, searchable text is called Optical Character Recognition (OCR). In this process, the computer recognizes the shapes of the characters on the page, much as a human reader does, and converts them to the codes used in searchable documents. While all scanners these days will scan your documents to a PDF, not all of them natively perform OCR.

If your scanner supports OCR, then creating a searchable PDF from a physical document is as easy as scanning the document.

If your scanner does not support OCR, you must use separate software. Some non-OCR scanners come with a CD containing OCR software you can install on your computer, most commonly ScanSoft's OmniPage (part of the PaperPort Suite, now owned by Nuance), which has shipped with many different brands of scanners for years.

Adobe Acrobat Standard or Professional (that is, the paid editions, not the free Adobe Acrobat Reader), and other full-featured commercial software by its competitors, will also perform OCR on scanned documents, or any PDF you might happen to receive that isn't searchable.

A scanner that does the OCR natively is a more robust solution, because OCR is applied automatically for anyone in your office who uses it. If OCR is done only through separate software, you need to install that software on the computer of every user who will scan, and ensure that each scanned PDF file is then passed through the software for the OCR to take place.

So, check the specifications of any scanner you're planning to buy to determine whether the scanner itself performs the OCR, whether it comes with OCR software, or whether you might need to use separate OCR software to scan documents to searchable PDF files. You might choose a scanner without native OCR capabilities if you happen already to have a paid edition of Adobe Acrobat, or Nitro PDF, Foxit PDF, Nuance Power PDF, or Tracker PDF software.

Finally, different OCR solutions may have different capabilities. The most rudimentary will convert to raw, unformatted text, while more sophisticated software is capable of recreating the page with the same formatting, typefaces, styles, and layout as the original document.

Converting PDFs to Editable Documents

Sometimes, you have only a PDF, and you want to open it in a regular word processor like Microsoft Word. Or, you might want to convert a PDF containing rows and columns of data into a Microsoft Excel file. Here are some options:

To Microsoft Word

Microsoft Word 2013 and later for Windows can open a PDF file directly, and convert it to a Word file, including graphics. Full-featured, commercial PDF software, such as Adobe Acrobat or alternatives by Foxit, Tracker, Nitro, or Nuance, all convert PDF files to Microsoft Word format as well.

If you don't have any of the above, you can upload a PDF file into Google Drive for free and open it with Google Docs, which converts it to a Google Docs document that can then be saved as a Microsoft Word document.

Also, if you open a PDF file in Adobe Reader for Windows or Mac, you will see links in the Tools pane to Adobe's online services; the Export PDF service will convert your PDF to a Microsoft Word file. It's not free, but it's not expensive at all, and may be a good option if the others listed here aren't available to you.

To Microsoft Excel

Older versions of Microsoft Excel do not support importing or converting a PDF (recent Microsoft 365 versions of Excel for Windows can import tables from a PDF through the Get Data feature). Software such as Able2Extract from Investintech is also available for purchase, which can extract data from PDF files and create an Excel worksheet from it. Adobe Acrobat and alternatives by Foxit, Nitro, and Nuance can also convert PDF files to Microsoft Excel format.

As for online services, Nitro Software had a web page where you could upload a PDF, and it would e-mail it to you as an Excel spreadsheet for free, although that seems to be gone. And Adobe's Export PDF (described above), while not free, supports converting to Excel.

Each of these options might provide output of varying quality. The conversion process is more of an art than science, and the program doing the conversion has to make a lot of assumptions, especially for Excel. If you find one of the options above doesn't do the job, try another.
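To see why the converter has to make assumptions, consider this minimal sketch in Python. It is not how any of the products named above work. It assumes the text of a table has already been extracted as plain lines, and it guesses that a run of two or more spaces separates columns. The function names and the sample data are made up for illustration.

```python
import csv
import io
import re


def rows_from_text(lines, min_gap=2):
    """Split each line into cells wherever min_gap or more spaces occur."""
    gap = re.compile(r" {%d,}" % min_gap)
    return [gap.split(line.strip()) for line in lines if line.strip()]


def to_csv(rows):
    """Render the rows as CSV text that a spreadsheet program can open."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical lines of text extracted from a PDF table.
sample = [
    "Item          Qty    Price",
    "Widget        2      4.50",
    "Large widget  10     12.00",
]

if __name__ == "__main__":
    print(to_csv(rows_from_text(sample)))
    # With a looser guess, "Large widget" is wrongly split into two cells.
    print(rows_from_text(sample, min_gap=1)[2])
```

A small change in the guess about column spacing changes the result, which is one reason different converters can produce different spreadsheets from the same PDF.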

Also, it's important to remember that the online conversion tools (Adobe Export PDF, Google Docs, and Nitro, as well as others not mentioned above) are only appropriate for PDF files with non-confidential data, since your PDF will be uploaded to the software provider's servers. For sensitive information, use the desktop applications mentioned above.

Security

You're probably wondering what security aspect would relate to a printable document. Well, Adobe Acrobat Reader has long been able to execute JavaScript program code embedded in a PDF file, as well as Flash-based video. As for JavaScript, the idea was to allow the PDF reader software to respond to what you're entering into a fillable form. For example, it can calculate totals, look up data to copy fields or fill in related entries as you fill in the form, pop up a message box if an invalid date is entered (like February 30), etc. The Flash video capability is mainly a way to keep PDF more relevant as we move (theoretically, at least) to a paperless economy.
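The kind of logic a form script performs is simple. The following minimal sketch shows it in Python purely for illustration: scripts embedded in a real PDF are written in JavaScript against the PDF reader's own scripting interface, which is not shown here, and the function names are made up for this example.

```python
from datetime import date
from decimal import Decimal


def is_valid_date(year: int, month: int, day: int) -> bool:
    """Return True if the given day exists on the calendar."""
    try:
        date(year, month, day)
    except ValueError:
        return False
    return True


def line_total(quantity: int, unit_price: str) -> Decimal:
    """Multiply quantity by price, using Decimal to avoid rounding surprises."""
    return Decimal(quantity) * Decimal(unit_price)


if __name__ == "__main__":
    print(is_valid_date(2024, 2, 30))  # February 30 does not exist
    print(line_total(3, "4.50"))
```

A form would run a check like this as you leave a field, then show a message box or update a total.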

Anyway, whenever a data file contains program instructions that execute on your computer just from opening the file, this creates a potential security concern. It is possible for someone to create a PDF with malicious program instructions; that is, code designed to do things to your computer that the PDF reader software was never intended to allow. When you get an anonymous junk e-mail with a PDF file attached, you should assume it may contain malicious code.

Apart from not opening PDF files indiscriminately, the best thing to do is ensure you have up-to-date PDF software, and ensure Protected Mode is enabled if you use Adobe products (which it is by default). If you are still running Adobe Acrobat or Acrobat Reader version 9 or earlier (which don't support Protected Mode) and can't upgrade, you should disable JavaScript in the program options.