Scanning

Introduction

In this section we provide an overview of the technical standards of scanning, and of the issues facing digital preservationists as they select sources for scanning and make decisions about format, resolution, and bit-depth. We discuss the potential uses of the conventional flatbed scanner, which scans sources and creates an image-file, and also provide a section on the relatively new technology of OCR (optical character recognition) scanning, which can "read" a text source and convert it into a word-processing file. For those in the market for a scanner, our section on Scanner Selection is intended to give some general guidelines for making your decision, and also to provide some external links to useful reviews and commercial sites.

Resolution

The resolution of a digital image is a measure of how many pixels or dots it contains per unit of length, normally expressed as pixels per inch (ppi) or dots per inch (dpi) and applied to both height and width. When you define the resolution of an image, you are determining the clarity and detail of that particular image-file. This seems simple enough, but we've found that the issue of resolution can be confusing, not least because we need to distinguish between resolution for the web, monitor settings, and scanner settings. Here we focus specifically on Scanning Resolution. Scanning at a higher resolution captures more detail from the source, resulting in a sharper image and a larger file.

As you shop for scanners you'll find that many offer extremely high resolution. However, this refers to the level of resolution the scanner itself can capture, not the resolution of the image-file as viewed on the web. For projects where the image is going to be shown on the web, a resolution of 72-100 dpi will be fine, since computer monitors typically display only about 72 dpi. For images you intend to print, a good rule of thumb is to scan at no lower than 150 dpi and no higher than 600 dpi, although this is likely to change as the technologies become increasingly sophisticated. Remember, if you plan on printing an image on a 360-dpi printer, you shouldn't scan the image at a higher resolution: the extra detail won't show up, and you'll end up with a needlessly huge file that takes forever to download.
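To make the trade-off between resolution and file size concrete, here is a minimal Python sketch of the arithmetic. The function names are hypothetical and chosen only for illustration. It assumes an uncompressed image at 24 bits per pixel unless told otherwise, so real files saved in a compressed format will usually be smaller.

```python
def pixel_dimensions(width_in, height_in, dpi):
    """Pixel width and height of a source scanned at the given dpi."""
    return round(width_in * dpi), round(height_in * dpi)


def uncompressed_size_bytes(width_in, height_in, dpi, bits_per_pixel=24):
    """Approximate size of the raw image data, before any compression.

    bits_per_pixel=24 is an assumption for this sketch, not a recommendation.
    """
    width_px, height_px = pixel_dimensions(width_in, height_in, dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


def capped_scan_dpi(requested_dpi, printer_dpi):
    """Scanning above the printer's resolution adds file size, not visible detail."""
    return min(requested_dpi, printer_dpi)


if __name__ == "__main__":
    # A 4 x 6 inch photograph at a web resolution and at a print resolution.
    for dpi in (72, 600):
        w, h = pixel_dimensions(4, 6, dpi)
        size = uncompressed_size_bytes(4, 6, dpi)
        print(f"{dpi} dpi: {w} x {h} pixels, about {size:,} bytes uncompressed")
    print("dpi to use for a 360-dpi printer:", capped_scan_dpi(600, 360))
```

For a 4 x 6 inch photograph, going from 72 dpi to 600 dpi raises the pixel count from 288 x 432 to 2400 x 3600, and the raw data from roughly 0.4 MB to roughly 26 MB. This is why web images and print or archival scans are usually kept as separate files.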

Color and bit-depth

Bit-depth refers to the amount of color information in your image, whether in the scanner or on your display. Simply put, the more colors your image has, the more bits it will require: 8 colors require 3 bits per pixel, and 256 colors require 8 bits. In Graphic Formats for the Web we discuss how to optimize your image for the web by reducing the number of colors, and thus the memory the image-file takes up. In terms of scanning, it is now possible to buy a scanner that has a 48-bit color depth. You will get amazing print quality as far as color and resolution go, but this does not reflect the level of quality you will attain with images for the web. Nevertheless, this is an area that is improving at breakneck speed, and the number of users with monitors that can show more than the 256 colors of the web increases daily. But again, although you may have top-of-the-line technology available to you, it is also vital to consider what types of monitors are available to your users. That said, many monitors can now display "High Color" (16-bit, about 65,000 colors), and we're now even seeing "True Color" (24-bit, about 16 million colors). In other words, you can see many more colors than on the typical 256-color display, and while this will not be an issue for images you intend to display on the web, it will be significant as you create a non-accessed database of Master Image Files for projects such as Digital Preservation.
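The relationship between the number of colors and the number of bits is a power of two: n bits per pixel can represent 2 to the power n distinct colors. The short Python sketch below works the conversion in both directions. The function names are hypothetical, used here only for illustration.

```python
def bits_for_colors(n_colors):
    """Smallest number of bits per pixel that can represent n_colors colors."""
    if n_colors < 1:
        raise ValueError("n_colors must be at least 1")
    # (n - 1).bit_length() is an exact integer form of ceil(log2(n)).
    return max(1, (n_colors - 1).bit_length())


def colors_for_bits(bits_per_pixel):
    """Number of distinct colors a given bit-depth can represent."""
    return 2 ** bits_per_pixel


if __name__ == "__main__":
    for colors in (8, 256):
        print(f"{colors} colors need {bits_for_colors(colors)} bits per pixel")
    for bits in (8, 24, 48):
        print(f"{bits}-bit color allows {colors_for_bits(bits):,} colors")
```

This reproduces the figures in the text: 8 colors need 3 bits, 256 colors need 8 bits, and 24-bit color allows 16,777,216 colors, the "16 million" of True Color.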

Types of scanners

We can't tell you which scanner to buy, and obviously this will primarily be a question of budget. However, price is not necessarily a reliable indicator of quality. Issues you should consider include scanning resolution (dots per inch, or dpi), speed, size of scanning area, accompanying software, and the types of materials you will be scanning (slides, photographs, text). According to NARA, Master Files of photographic images should be scanned at a resolution of a whopping 3,000 dpi, but this is only important if you are planning a hugely extensive digital preservation project. Accessed Files of photographic images (i.e., those that can be viewed from the web), by contrast, should have a resolution of around 72-120 dpi, depending on the complexity of the image. Most scanners are categorized according to resolution and speed, so you'll be able to make decisions according to your specific needs. As you consider this type of project it will be necessary to do thorough research into the products available and the recommendations of your field. The most valuable advice will come from other archivists, librarians, and curators who have completed similar projects.