PDF417 is a stacked two-dimensional (2D) bar code symbology that can encode large amounts of data compactly and with built-in error correction. It was developed by Symbol Technologies in 1991 and is standardized as ISO/IEC 15438 (the edition cited here is ISO/IEC 15438:2006). PDF417 is widely used in applications such as identification cards, transport tickets, inventory management, and postal services.
A PDF417 symbol consists of a stack of rows, each containing a pattern of bars and spaces. Each row has a start pattern, a left row indicator, one or more data codewords, a right row indicator, and a stop pattern. The row indicators carry the row number together with the number of rows, the number of columns, and the error correction level. Adjacent rows touch one another with no gap between them; the quiet zone is a margin of white space that surrounds the symbol as a whole.
Within each row, every codeword is 17 modules wide and consists of four bars and four spaces, each from one to six modules wide; this is the source of the "417" in the name. Each codeword pattern represents a value from 0 to 928. There are three sets of patterns, called clusters (numbered 0, 3, and 6), and each value has a different pattern in each cluster. The cluster in use depends on the row number and repeats every three rows, so a decoder can tell adjacent rows apart even when a scan line drifts from one row into the next.
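The relationship between rows, clusters, and bar widths can be sketched in a few lines. This is a minimal illustration, assuming zero-based row indices and the commonly cited cluster formula K = (b1 - b2 + b3 - b4 + 9) mod 9 over the four bar widths; the function names and the sample width tuple are hypothetical and chosen only for the example.

```python
def cluster_for_row(row: int) -> int:
    """Cluster number (0, 3, or 6) used by a zero-based row index."""
    return (row % 3) * 3


def cluster_of_pattern(widths: tuple[int, ...]) -> int:
    """Cluster implied by eight element widths given as bar, space, bar, space, ..."""
    if len(widths) != 8 or sum(widths) != 17:
        raise ValueError("a codeword has 8 elements totalling 17 modules")
    b1, b2, b3, b4 = widths[0::2]  # bar widths only; spaces are skipped
    return (b1 - b2 + b3 - b4 + 9) % 9


# Illustrative widths: bars 3, 1, 1, 3 and spaces 1, 1, 1, 6 (17 modules total).
print(cluster_of_pattern((3, 1, 1, 1, 1, 1, 3, 6)))  # 0
print([cluster_for_row(r) for r in range(5)])        # [0, 3, 6, 0, 3]
```

A decoder can compare the cluster computed from a scanned pattern with the cluster expected for the current row to notice that a scan line has crossed into a neighbouring row.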
The data codewords are encoded using one of three compaction modes: text, byte, or numeric. Text compaction encodes alphanumeric characters and punctuation, packing two characters into each codeword; it has four sub-modes (alpha, lower, mixed, and punctuation) selected by latch and shift characters. Byte compaction can encode any 8-bit value, packing six bytes into five codewords. Numeric compaction encodes the digits 0 to 9, packing up to 44 digits into 15 codewords. Reserved codeword values (900 and above) switch between modes and introduce functions such as Macro PDF417. The row count, column count, and error correction level are not set by a mode; they are carried in the row indicators. The error correction codewords are likewise not a mode of their own; they are computed from the data codewords and appended after them.
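Numeric encoding is the easiest of these to show in code. The sketch below assumes the usual scheme in which digits are taken in groups of at most 44, a leading 1 is prepended to each group to preserve leading zeros, and the result is rewritten in base 900. The function name is hypothetical, and the mode-switch codeword that would precede the output in a real symbol is left out.

```python
def numeric_compaction(digits: str) -> list[int]:
    """Convert a digit string into base-900 codewords, 44 digits per group."""
    if not digits or any(c not in "0123456789" for c in digits):
        raise ValueError("expected a non-empty string of ASCII digits")
    codewords: list[int] = []
    for start in range(0, len(digits), 44):
        group = digits[start:start + 44]
        value = int("1" + group)          # leading 1 keeps leading zeros
        chunk: list[int] = []
        while value > 0:
            value, remainder = divmod(value, 900)
            chunk.append(remainder)       # least significant digit first
        codewords.extend(reversed(chunk)) # emit most significant first
    return codewords


print(numeric_compaction("000213298174000"))
# [1, 624, 434, 632, 282, 200]
```

Fifteen digits become six codewords here; a full group of 44 digits becomes 15 codewords, which is where the density advantage over text encoding comes from.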
A PDF417 symbol can have different formats depending on the amount of data and the level of error correction required. A symbol can hold up to 928 codewords (including data and error correction codewords), arranged in 3 to 90 rows and 1 to 30 data columns. Related variants include Macro PDF417 (which allows a large message to be split across multiple linked symbols), Truncated PDF417 (also called Compact PDF417, which omits the right row indicator and shortens the stop pattern to a single bar to save space), and MicroPDF417 (a separate, related symbology for very small symbols, specified in its own standard).
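Whether a given amount of data fits a chosen layout can be checked mechanically. The following is a minimal sketch, assuming limits of 3 to 90 rows, 1 to 30 data columns, and 928 codewords in total; the function name is hypothetical, and data_codewords is assumed to already include any length descriptor and padding.

```python
MAX_CODEWORDS = 928
MIN_ROWS, MAX_ROWS = 3, 90
MIN_COLS, MAX_COLS = 1, 30


def rows_needed(data_codewords: int, ec_codewords: int, columns: int) -> int:
    """Smallest row count that holds all codewords in the given column count."""
    if not MIN_COLS <= columns <= MAX_COLS:
        raise ValueError("columns must be between 1 and 30")
    total = data_codewords + ec_codewords
    rows = max(MIN_ROWS, -(-total // columns))  # integer ceiling division
    if rows > MAX_ROWS or rows * columns > MAX_CODEWORDS:
        raise ValueError("codewords do not fit in this column count")
    return rows


print(rows_needed(600, 64, 20))  # 34
```

Any cells left over in the last row (here 34 x 20 = 680 cells for 664 codewords) would be filled with padding codewords by a real encoder.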
The dimensions of a PDF417 symbol depend on the number of rows, the number of modules per row, the module width (X-dimension), the row height (Y-dimension), the aspect ratio of the finished symbol, and the quiet zone size. The recommended values for these parameters are specified in ISO/IEC 15438:2006. For example, for a symbol with 18 rows and 17 modules per row, the recommended values are:
The resulting symbol size would be:
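As a rough sketch of how these parameters combine into an overall size, the function below counts modules under a few assumptions that are not taken from the text above: each row is laid out as a 17-module start pattern, two 17-module row indicators, 17 modules per data column, and an 18-module stop pattern; the row height is 3 modules; and the quiet zone is 2 modules on every side. The function name, the 3-column layout, and the 0.25 mm X-dimension are illustrative only.

```python
def symbol_size_modules(rows: int, data_columns: int,
                        row_height: int = 3, quiet_zone: int = 2) -> tuple[int, int]:
    """Return (width, height) in modules, including the quiet zone."""
    # start + left indicator + data columns + right indicator = 17 each; stop = 18
    width = 17 * (data_columns + 3) + 18 + 2 * quiet_zone
    height = rows * row_height + 2 * quiet_zone
    return width, height


width, height = symbol_size_modules(rows=18, data_columns=3)
x_mm = 0.25  # assumed X-dimension in millimetres
print(width, height)                # 124 58
print(width * x_mm, height * x_mm)  # 31.0 14.5
```

Multiplying the module counts by the chosen X-dimension gives the printed size, so halving the number of columns while doubling the rows trades width for height.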
A PDF417 symbol can have different levels of error correction, ranging from level 0 (error detection only) to level 8 (the highest level). The level of error correction determines how many error correction codewords are added to the symbol and therefore how many errors can be corrected. The error correction codewords are generated using a Reed-Solomon algorithm over the prime field of 929 elements. Reed-Solomon coding is a type of forward error correction technique that can correct both random and burst errors.
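The generation of error correction codewords can be sketched as polynomial division modulo 929. The code below is a minimal illustration, not a reference implementation: it assumes 2^(level + 1) error correction codewords, a generator polynomial with roots 3, 3^2, ..., 3^k, and data codewords given most significant first. The function names and the sample data are chosen for the example.

```python
MOD = 929  # prime field size for PDF417 codeword arithmetic


def generator_poly(k: int) -> list[int]:
    """Coefficients, lowest degree first, of (x - 3)(x - 3^2)...(x - 3^k) mod 929."""
    g = [1]
    root = 1
    for _ in range(k):
        root = root * 3 % MOD
        nxt = [0] * (len(g) + 1)
        for i, coef in enumerate(g):
            nxt[i] = (nxt[i] - coef * root) % MOD   # coef * (-root)
            nxt[i + 1] = (nxt[i + 1] + coef) % MOD  # coef * x
        g = nxt
    return g


def ec_codewords(data: list[int], level: int) -> list[int]:
    """Error correction codewords for the data, highest-degree codeword first."""
    k = 2 ** (level + 1)
    g = generator_poly(k)
    rem = [0] * k  # running remainder of data(x) * x^k divided by g(x)
    for d in data:
        feedback = (d + rem[k - 1]) % MOD
        for j in range(k - 1, 0, -1):
            rem[j] = (rem[j - 1] - feedback * g[j]) % MOD
        rem[0] = (-feedback * g[0]) % MOD
    # The codewords are the negated remainder, so data + EC is divisible by g(x).
    return [(-r) % MOD for r in reversed(rem)]


print(ec_codewords([5, 453, 178, 121, 239], level=1))
# [452, 327, 657, 619]
```

Because the combined sequence is a multiple of the generator polynomial, evaluating it at each root gives zero; a decoder uses non-zero results (syndromes) to locate and correct damaged codewords.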
The number of error correction codewords (EC) depends only on the level of error correction (L), not on the number of data codewords (DC) in the symbol. The formula for calculating EC is:
EC = 2^(L + 1)
This gives 2 codewords at level 0 and 512 at level 8. The number of errors that can be corrected (NE) depends on the number of error correction codewords (EC) and the number of erasures (E) in the symbol. An erasure is a codeword whose position is known but whose value could not be read, such as a codeword in a damaged or unreadable part of a row. Correcting an error at an unknown position uses two error correction codewords, while filling an erasure uses one. A decoder may also reserve part of this capacity for error detection, which lowers the practical figure. The Reed-Solomon upper bound on NE is:
NE = floor((EC - E) / 2)
For example, for a symbol with level 5 error correction, 600 data codewords, and 10 erasures, the number of error correction codewords would be:
EC = 2^(5 + 1) = 64
The number of errors that can be corrected would be at most:
NE = floor((64 - 10) / 2) = 27
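The capacity formula is simple enough to express directly. This is a small sketch of NE = floor((EC - E) / 2) with a hypothetical function name; the values in the usage line are illustrative and unrelated to the example above.

```python
def correctable_errors(ec_codewords: int, erasures: int) -> int:
    """Upper bound on correctable errors given EC codewords and known erasures."""
    if ec_codewords < 0 or erasures < 0:
        raise ValueError("counts must be non-negative")
    # Each erasure uses one EC codeword; each remaining error uses two.
    return max(0, (ec_codewords - erasures) // 2)


print(correctable_errors(ec_codewords=16, erasures=4))  # 6
```

The max(0, ...) guard covers the case where there are more erasures than error correction codewords, in which case nothing further can be corrected.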
The reference decoding algorithm for PDF417 is specified in ISO/IEC 15438:2006. It describes the steps for decoding a PDF417 symbol from an image, such as a scanned or photographed document. The steps are:
A PDF417 symbol can have various application parameters that specify how the data should be interpreted or processed by the application that reads the symbol. Some of these parameters are:
The application parameters can be encoded in the symbol using special control codes or macro PDF417 segments. The application that reads the symbol should be able to recognize and process these parameters accordingly.