128 West Main Street
Dryden, NY 13053
Tel: 607.844.4011
Fax: 607.844.3228
Below you will find our recommendations for creating a stub that is optimized for machine document handling and that will get the best results with the Flex product and its supported transports. The recommendations cover paper and ink characteristics, OCR (Optical Character Recognition) guidelines, and form organization. We will also cover issues related to ICR (Intelligent Character Recognition).

A sample bill as sent out by one of our customers:

While there are many things done right on this bill, there are a few things we would change. The customer logo on the stub acts as the logo for the entire bill, which saves space on the document, but the result is that the stub needs to be printed at the top.

This is not the best case, as we then have to rely on the bill-paying customer to properly tear off the stub and give us a good bottom edge. Even more disastrous results will occur if the machine making the perforated cut in the middle of the document is not cutting in the proper place. A whole billing cycle may come in with the OCR line in an unreadable position, creating the need to key in all of the stubs.

We recommend that the stub instead be placed at the bottom of the bill, with the perforated edge(s) at the top and, if necessary, on the left-hand side. This means that the edges used by the machine will be much more consistent, creating higher read rates.

The bill is laser printed, which is preferred. Laser print creates a sharp OCR line, which should read at nearly 100%, with exceptions only when the document is compromised by being overwritten with handwriting, going through the OCR reader while not flat and level (improperly jogged, or having a bad bottom edge), or being exposed to other situations that cause "noise" to be added to the OCR line.

Taking a closer look at the stub portion of this bill (below), we see the following (frames 1-3 on the image):

Sample stub

1.  The OCR line

  • Location: The best case is to have the OCR line located at the bottom right of the document with no printed information to its right.
  • Clear Band (or no-noise zone): The area around the OCR line should be clear (all white – no printing) on the top and bottom for at least ¼ inch from the edge of the printing. Usually this means that the OCR line including the clear band will be at least 5/8th of an inch high.
  • OCR Font: A standard OCR font must be used to take advantage of the machine reading capability. The preferred fonts are OCR-A and OCR-B, printed at 10 characters per inch. Any other character spacing may cause lowered read rates. Other existing OCR fonts can be considered, but should be approved in advance before assuming that they can be read.
  • Printer Selection: As we said above, laser printed documents will have the best success with OCR. The next best is usually an impact type printer, as long as the ribbon is kept fresh during the printing process. The worst results are obtained by band (line) printers.
  • Print Registration: Another consideration is the registration of the printing. When the registration is good, it means that the document information is consistently printed in the same location on the document. If the registration is skewed too far in any direction, it will be outside the OCR head position or zone set up in the parameters.

    The clear band comes into play here with registration when working with a unit like the 7731. If a large clear band is provided, then registration can be less strict, and the OCR read zone can be made larger to accommodate the OCR scanline being in a different position. If the clear band is not as large an area, and the zone is constrained, print registration becomes critical.
  • OCR Field Spacing: Providing a space between fields can help the operator when performing field completion on a misread field. It is also useful for extraction (on some machines), because the space can be used to "delimit" fields: it keeps information from one field from being read into another, and it helps the extraction process find the proper start of a field that lies to the left of a field that did not read properly.

    The Panini compresses all spaces out of its OCR data, so this technique cannot be used to extract discrete fields. On the 7731 however, this technique can be used to keep a misread field from affecting the other extracted fields.

    When creating a document with space delimited fields for the 7731, do not use more than one space in between the fields. One space will be accounted for by the OCR engine. More than one space will degrade the performance of the OCR read and slow the machine’s response time down.
  • OCR Line Consistency: The location of the OCR line should be consistent between all stub definitions so that the operator is not required to change the position of the OCR read head between batches on machines that have a physical OCR reader (such as the Panini).

    On machines that perform OCR from the image (like the 7731), it is important for performance reasons to locate all of the OCR zones in the same place. On the 7731, this includes the MICR line, as it is read as an E13B OCR line. If the zone for each stub is located in the same place as the zone for the check, then the 7731 can use the same grayscale image data for every OCR process. The grayscale data has the best read rate.

    If the OCR zone is located in a different position than the MICR line of the check, then the Flex software will attempt to create one zone that will incorporate both of the individual zones. The OCR zone used in the 7731 hardware has a maximum area of about 6 in². If both zones cannot be combined into a zone of this size or smaller, the software will create two zones instead. The tradeoff in doing this is that the second zone (used for MICR in Flex) must be taken from the bi-level (black and white) image that was lifted. This black and white image will provide a slightly lower read rate. The same goes for multiple stub OCR line definitions. They should be kept in the same zone for best results.
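The zone-combining rule described above can be illustrated with a short sketch. This is not the Flex implementation; it only assumes that zones are axis-aligned rectangles measured in inches and that "combining" means taking the smallest rectangle that covers both. The names `Zone`, `bounding_zone`, and `plan_zones` are hypothetical, and the 6 in² limit is the approximate figure quoted above.

```python
from __future__ import annotations

from dataclasses import dataclass

# Approximate 7731 hardware limit quoted in the text (square inches).
MAX_ZONE_AREA_SQ_IN = 6.0


@dataclass(frozen=True)
class Zone:
    """Axis-aligned read zone in inches from the document's bottom-left corner (assumed convention)."""
    left: float
    bottom: float
    right: float
    top: float

    @property
    def area(self) -> float:
        return (self.right - self.left) * (self.top - self.bottom)


def bounding_zone(a: Zone, b: Zone) -> Zone:
    """Smallest rectangle that covers both zones."""
    return Zone(min(a.left, b.left), min(a.bottom, b.bottom),
                max(a.right, b.right), max(a.top, b.top))


def plan_zones(stub_ocr: Zone, check_micr: Zone) -> list[Zone]:
    """Return one combined zone if it fits the area limit, otherwise both zones."""
    merged = bounding_zone(stub_ocr, check_micr)
    if merged.area <= MAX_ZONE_AREA_SQ_IN:
        return [merged]
    # Two zones: per the text, the second would be read from the bi-level image.
    return [stub_ocr, check_micr]


# Stub OCR line at the same height as the check's MICR line: one zone suffices.
same_height = plan_zones(Zone(2.0, 0.25, 8.0, 0.875), Zone(0.5, 0.25, 8.0, 0.875))
# Stub OCR line printed about 3 inches higher: the combined zone is too large.
different_height = plan_zones(Zone(2.0, 3.0, 8.0, 3.625), Zone(0.5, 0.25, 8.0, 0.875))
print(len(same_height), len(different_height))
```

With these illustrative coordinates, the first layout combines into a single zone of about 4.7 in², while the second would need roughly 25 in² and therefore falls back to two zones.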

2. ICR Considerations

ICR reading rates are still only marginally cost-justifiable, and so every consideration should be taken to provide for the best possible read rates. We recommend that all ICR customers have their documents test printed (if applicable) so that as many issues as possible are resolved before the site goes live with new documents.

A document read rate that would be considered good is probably around 60-70% for a meter-read type of application, as shown on the example stub. Because these read rates are achieved during a batch process, the operator is saved from keying 60-70% of the meter readings and keys only the remaining 30-40%, a time savings that can be justified. Some stub layouts have already been set up for ICR, so if your application matches a previously set up configuration, it is recommended that you mimic that layout. Here are some of the issues to keep in mind as you design the ICR portion of your stub:

  • Character box shading: The location of the handwriting needs to be consistent for proper recognition. To ensure that the data is consistently placed, shaded boxes are provided. The shading should be done using "no-repro" blue or green "drop-out" ink, which will cause the shaded box to be invisible to the image camera. The boxes should also not be printed solid, but be printed at a 100 and 50 screen. This will help keep an over-sensitive camera from seeing the ink even if it is within specifications.

  • Location of the ICR boxes: The boxes should have a clear band around them of at least 1/8th inch. The clearer the area surrounding the boxes, the better the chance of reading handwriting. The boxes often are not filled in properly, and if the handwriting strays out of the box, it can still be read if the area around it is clear. The example we have above does not follow this rule for the "Amount Enclosed" text printed above the ICR boxes, and the read rates have suffered slightly because of this.

  • Spacing and Size: The boxes should be 2/10th of an inch square. The spacing between the boxes should be 1/10th inch.

  • ICR-Related Data: There are some times when the information filled in the ICR field will not be valid. In these cases, it is helpful to add some data to the OCR line to indicate that ICR is not to be performed. In the cases where the ICR data is a meter reading (as is the case above), the bill may be of a special type, where a meter reading is not desired, though the shaded boxes appear on the standard form. The meter reading may not be for the current month, so the billing month could be included on the OCR line for verification. These special cases should be considered while creating the bill so that the OCR line can be designed to include special (numeric) flags that will assist in getting the best performance possible.

  • ICR instructions: Information should be included somewhere on the bill on the importance of properly filling in the ICR boxes. An example of properly filled in boxes would also be helpful.
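As a rough planning aid, the box dimensions above can be turned into the footprint an ICR field needs on the stub. The sketch below assumes that the 1/10th inch spacing is the gap between the edges of adjacent boxes and that the 1/8th inch clear band applies on all four sides; the function names are hypothetical.

```python
from fractions import Fraction

# Dimensions taken from the guidelines above, in inches.
BOX_SIDE = Fraction(2, 10)    # each box is 2/10 inch square
BOX_GAP = Fraction(1, 10)     # assumed: gap between adjacent box edges
CLEAR_BAND = Fraction(1, 8)   # minimum clear area around the boxes


def icr_field_footprint(num_boxes: int) -> tuple[Fraction, Fraction]:
    """Return (width, height) in inches of a row of boxes including the clear band."""
    if num_boxes < 1:
        raise ValueError("an ICR field needs at least one box")
    row_width = num_boxes * BOX_SIDE + (num_boxes - 1) * BOX_GAP
    return row_width + 2 * CLEAR_BAND, BOX_SIDE + 2 * CLEAR_BAND


def box_left_edges(num_boxes: int) -> list[Fraction]:
    """Left edge of each box in inches, measured from the left edge of the first box."""
    pitch = BOX_SIDE + BOX_GAP
    return [i * pitch for i in range(num_boxes)]


width, height = icr_field_footprint(5)
print(float(width), float(height))  # 1.65 0.45
```

Under these assumptions, a five-digit meter reading needs an area about 1.65 inches wide and 0.45 inch high that is free of other printing.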

3. ICR Registration Mark

  • Importance of registration mark: When performing the ICR process, the software needs accurate alignment of the ICR boxes to find and interpret the handwriting properly. To provide the best possible read rates, a registration mark printed in black ink should be provided in perfect registration with the ICR box shading. Without this registration mark, the ICR process will suffer a large read rate decline. A loss of 20-30% could occur.

  • Type of mark: The registration mark should be a crossbar made of a vertical and horizontal line, or a corner (as the example above uses). This allows horizontal and vertical adjustment of the image to be done in the digital realm before processing the handwriting. The line should be of medium thickness, enough to provide a clear black line when imaged. Since we generally image documents at 200 DPI, the line should probably be about 5-8 pixels thick, making it about 0.025 to 0.040 inch wide. Outward from the corner or intersection of these lines, there should be no other printed information for at least 3/8th inch.

  • Preprinted Forms: When using drop-out ink, a preprinted form is often used to eliminate the need for a two-color laser production printer. When the document is preprinted with the ICR boxes, customer logo, etc., the registration mark should also be printed in black ink, with careful attention to the registration of the print of each color. It is critical to keep the relative position of the ICR boxes to the registration mark perfectly consistent.

  • Multiple Registration Marks: There are times when it can be helpful to provide more than one registration mark. When this is done, usually two marks are made – one each at opposite corners of the area where ICR is being performed. This allows the software to provide scaling and rotational correction of the image. In the current line of transports supported by Flex, this is not a large concern, as scanning is nearly always a flat and level operation, and the image pixel density is relatively constant. Scaling would not be a concern. If the installer sees a scanning product in the future for their customer that may provide a less reliable image and may need these types of correction, plan up-front for registration marks at the top-left and bottom-right (or similar) so that we are ready to implement the new hardware down the road with little or no changes to the customer’s document(s).
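To show why two marks at opposite corners are enough for scaling and rotational correction, here is a minimal sketch of the geometry. It is not taken from the Flex software; it only assumes that the pixel positions of both marks are known for the form design and have been located in the scanned image. The function name and the coordinates are hypothetical.

```python
import math

Point = tuple[float, float]


def scale_and_rotation(expected_a: Point, expected_b: Point,
                       found_a: Point, found_b: Point) -> tuple[float, float]:
    """Estimate image scale and rotation (degrees) from two registration marks.

    expected_* are the mark positions in the form design; found_* are the
    positions located in the scanned image. All values are in pixels.
    """
    ex, ey = expected_b[0] - expected_a[0], expected_b[1] - expected_a[1]
    fx, fy = found_b[0] - found_a[0], found_b[1] - found_a[1]
    expected_len = math.hypot(ex, ey)
    if expected_len == 0:
        raise ValueError("the two expected marks must be at different positions")
    # Ratio of mark-to-mark distances gives the scale factor.
    scale = math.hypot(fx, fy) / expected_len
    # Difference of the two vector angles gives the rotation.
    rotation = math.degrees(math.atan2(fy, fx) - math.atan2(ey, ex))
    # Normalize to the range [-180, 180).
    rotation = (rotation + 180.0) % 360.0 - 180.0
    return scale, rotation


# Marks designed at top-left and bottom-right of the ICR area (illustrative values).
scale, rotation = scale_and_rotation((100, 100), (500, 300), (110, 95), (510, 295))
print(scale, rotation)  # 1.0 0.0: the image is only shifted, not scaled or rotated
```

A single mark can only reveal a horizontal and vertical shift; the second mark supplies the distance and angle needed for the other two corrections, which is why planning for it up-front costs little.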