Image Formats:
TIFF — Tagged image file format; a container for storing images; historically, bi-tonal, Group 4 compressed TIFFs have been the most popular file format used in document imaging applications.
PDF — Portable Document Format; a document file format that can encapsulate both scanned images and electronically generated documents. Adobe introduced PDF in 1993 and says that more than 500 million free PDF readers have been distributed since then. PDF was originally dismissed as a document imaging format because the files can be cumbersome to view and because the format was proprietary. Since the early 2000s, however, it has gained widespread adoption in document imaging applications because of its universality.
By 2007, PDF had become widely accepted as a de facto standard, and in addition to Adobe, several hundred other ISVs were developing software to create PDF documents. That year, Adobe submitted version 1.7 of PDF to ISO (the International Organization for Standardization) for approval as an international standard. Future versions of PDF will be approved by an ISO committee.
PDF supports several image compression methods, including Group 4, JBIG2, JPEG, and JPEG 2000. Segmentation and layering techniques, including MRC, can also be used to combine multiple compression methods to create optimum PDF image file sizes.
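To make the layering idea concrete, here is a minimal Python sketch of how an MRC-style page can be reassembled from three layers: a bi-tonal mask, a foreground layer, and a background layer. The function name `mrc_compose` and the list-of-lists pixel representation are illustrative assumptions, not part of any PDF library. Real encoders typically also store the layers at different resolutions and compress each one with a different method (for example, the mask with Group 4 or JBIG2 and the color layers with JPEG or JPEG 2000).

```python
def mrc_compose(mask, foreground, background):
    """Rebuild a page from MRC-style layers (hypothetical helper).

    mask: rows of 0/1 values; 1 selects the foreground pixel.
    foreground, background: rows of pixel values, same size as mask.
    """
    page = []
    for mask_row, fg_row, bg_row in zip(mask, foreground, background):
        page.append(
            [fg if m else bg for m, fg, bg in zip(mask_row, fg_row, bg_row)]
        )
    return page


# Toy 2x3 page: text pixels (mask == 1) take the foreground color,
# everything else shows the background layer.
mask = [[1, 0, 0],
        [0, 1, 1]]
foreground = [["black"] * 3, ["black"] * 3]
background = [["white"] * 3, ["beige"] * 3]
page = mrc_compose(mask, foreground, background)
```

Because the mask is bi-tonal and the color layers carry little fine detail once the text is separated out, each layer can be compressed with the method that suits it best.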
Several subsets of PDF have also been released over the years, including PDF/A and PDF/X.
PDF/A — A subset of PDF designed for long-term storage of electronic documents. The PDF/A spec is laid out so that PDF/A documents created today will be viewable with a PDF/A viewer in perpetuity. It is designed to guard against the obsolescence historically associated with electronic document storage. (http://www.aiim.org/standards.asp?id=25013)
PDF/X — A subset of PDF aimed at graphics exchange for print production. Engineering documents are addressed by a separate subset, PDF/E.
DjVu — An advanced compression file format created through techniques similar to MRC. Different compression methods are used for text and graphics in DjVu images, which results in optimum file sizes. DjVu was originally developed by AT&T Labs and primarily targeted at applications that involved viewing images over the Web. In 2000, AT&T Labs sold the DjVu technology to a Seattle-based geospatial compression specialist. Because of DjVu’s nature as a proprietary format, it never gained widespread adoption. Once PDF began supporting MRC, DjVu was further marginalized.
JPEG — The acronym stands for 'Joint Photographic Experts Group,' the committee that created the standard image compression and rendering code used in JPEG files. The standard was approved in 1994 as ISO 10918-1, and the JPEG file format was developed on the basis of this standard. Most color document scanners offer color JPEG output. Because JPEG was designed primarily for graphic images, it sometimes creates artifacts around text that make it a less than ideal document image format.
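JBIG2 — Introduced in 2000, JBIG2 is a bi-tonal compression format that can produce files up to 10 times smaller than Group 4 compressed files. It works by substituting symbols for characters: each distinct character shape is stored once, and repeated occurrences on the page refer back to the stored symbol.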
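The symbol substitution idea can be illustrated with a toy Python sketch. It is not the JBIG2 codec: the function `encode_symbols` and its data layout are hypothetical, it matches glyph bitmaps only when they are exactly identical, and it omits the entropy coding a real encoder applies afterward.

```python
def encode_symbols(glyphs):
    """Toy symbol substitution (hypothetical helper, not real JBIG2).

    glyphs: list of (bitmap, x, y), where bitmap is a tuple of row strings.
    Returns (dictionary, placements): each distinct bitmap is stored once,
    and every glyph becomes a (symbol_id, x, y) reference into it.
    """
    dictionary = []   # unique bitmaps, in order of first appearance
    index = {}        # bitmap -> position in dictionary
    placements = []
    for bitmap, x, y in glyphs:
        if bitmap not in index:
            index[bitmap] = len(dictionary)
            dictionary.append(bitmap)
        placements.append((index[bitmap], x, y))
    return dictionary, placements


# Two tiny 3x3 "characters"; the first one appears twice on the page.
glyph_a = ("###", "#.#", "###")
glyph_b = ("#..", "#..", "###")
dictionary, placements = encode_symbols(
    [(glyph_a, 0, 0), (glyph_b, 8, 0), (glyph_a, 16, 0)]
)
```

On a page of text, where the same letter shapes recur many times, storing each shape once and then only its positions is what gives this approach its advantage over coding every pixel run.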
JPEG 2000, Part 6 — A document imaging-specific version of the JPEG 2000 file format. Images are stored as JPM files, which can incorporate multiple methods of compression, including JPEG and JPEG 2000 for graphic components of a document and Group 4 and JBIG2 for textual components.
Bitmap — The BMP file format; an uncompressed image file format. Typically, BMP files are too large to be useful in document imaging applications.
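A short calculation shows why uncompressed files get so large. The sketch below assumes a letter-size page scanned at 300 dpi (2550 x 3300 pixels) and counts pixel data only, ignoring the small file headers; the function name is illustrative. BMP pads each row of pixels to a multiple of 4 bytes, which the formula accounts for.

```python
def bmp_pixel_data_bytes(width_px, height_px, bits_per_pixel):
    """Size of uncompressed BMP pixel data, with rows padded to 4 bytes."""
    row_bytes = ((width_px * bits_per_pixel + 31) // 32) * 4
    return row_bytes * height_px


# Assumed example: 8.5 x 11 inch page at 300 dpi.
color_bytes = bmp_pixel_data_bytes(2550, 3300, 24)   # 24-bit color
bitonal_bytes = bmp_pixel_data_bytes(2550, 3300, 1)  # 1-bit black and white

print(color_bytes)    # 25251600 (about 25 MB)
print(bitonal_bytes)  # 1056000 (about 1 MB)
```

Even the bi-tonal page takes roughly a megabyte before any compression is applied, which is why compressed formats dominate document imaging.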
XML — Extensible Markup Language; a file format designed to carry data between applications.
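As a rough illustration of carrying data between applications, the Python sketch below writes index data for a scanned document as XML and reads it back. The element names (`document`, `type`, `image`) and the function `build_index_record` are hypothetical and do not follow any particular imaging product's schema.

```python
import xml.etree.ElementTree as ET


def build_index_record(doc_id, doc_type, image_file):
    """Serialize index data for one scanned document (hypothetical schema)."""
    root = ET.Element("document", attrib={"id": doc_id})
    ET.SubElement(root, "type").text = doc_type
    ET.SubElement(root, "image").text = image_file
    return ET.tostring(root, encoding="unicode")


# One application writes the record...
xml_text = build_index_record("0001", "invoice", "0001.tif")

# ...and another parses it without knowing how it was produced.
parsed = ET.fromstring(xml_text)
doc_type = parsed.findtext("type")
```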