While there are many things done right on this
bill, there are a few things we would change. The customer logo on the stub acts as the
logo for the entire bill, which saves space on the document, but the result is that the
stub needs to be printed at the top.
This is not the best case, as we then have
to rely on the bill-paying customer to properly tear off the stub and give us a good
bottom edge. Even more disastrous results will occur if the machine making the perforated
cut in the middle of the document is not cutting in the proper place. A whole billing
cycle may come in with the OCR line in an unreadable position, creating the need to key in
all of the stubs.
We recommend that the stub instead be placed at the bottom
of the bill, with the perforated edge(s) at the top and, if necessary, on the left-hand
side. This means that the edges used by the machine will be much more consistent, creating
higher read rates.
The bill is laser printed, which is preferred. Laser print
creates a sharp OCR line, which should read at nearly 100% with exceptions only when the
document is compromised by being overwritten with handwriting, going through the OCR
reader while not flat and level (improperly jogged, or having a bad bottom edge), or being
exposed to other situations that cause "noise" to be added to the OCR line.

Taking a closer look at the stub portion of
this bill (below), we see the following (frames 1-3 on the image):

1. The OCR line
- Location: The best case is to have the OCR line located
at the bottom right of the document with no printed information to its right.
- Clear Band (or no-noise zone): The area around the OCR
line should be clear (all white, no printing) on the top and bottom for at least ¼
inch from the edge of the printing. Usually this means that the OCR line, including the
clear band, will be at least 5/8 of an inch high.
- OCR Font: A standard OCR font must be used to take
advantage of the machine reading capability. The preferred fonts are OCR-A and OCR-B,
printed at 10 characters per inch. Any other character spacing may cause lowered read
rates. Other existing OCR fonts can be considered, but they should be approved in advance
rather than assumed to be readable.
- Printer Selection: As we said above, laser printed
documents will have the best success with OCR. The next best is usually an impact type
printer, as long as the ribbon is kept fresh during the printing process. The worst
results are obtained by band (line) printers.
- Print Registration: Another consideration is the registration of the printing. When the
registration is good, the document information is consistently printed in the same
location on the document. If the registration is skewed too far in any direction, the OCR
line will fall outside the OCR head position or zone set up in the parameters.
The clear band comes into play here with registration when working with a unit like the
7731. If a large clear band is provided, then registration can be less strict, and the OCR
read zone can be made larger to accommodate the OCR scanline being in a different
position. If the clear band is smaller and the zone is constrained, print
registration becomes critical.
- OCR Field Spacing: Providing a space between fields
can be helpful to the operator when performing field completion on a misread field. On
some machines it is also useful for extraction, ensuring that information from one field
is not read into another: the space can be used to "delimit" the field and help the
extraction process find the proper start of a field that lies to the left of a field
which did not read properly.
The Panini compresses all spaces out of its OCR data, so this technique cannot be used to
extract discrete fields. On the 7731, however, this technique can be used to keep a misread
field from affecting the other extracted fields.
When creating a document with space-delimited fields for the 7731, do not use more than
one space between the fields. One space will be accounted for by the OCR engine. More
than one space will degrade the performance of the OCR read and slow the machine's
response time.
- OCR Line Consistency: The location of the OCR line
should be consistent between all stub definitions so that the operator is not required to
change the position of the OCR read head between batches on machines that have a physical
OCR reader (i.e. Panini).
On machines that perform OCR from the image (like the 7731), it is important for
performance reasons to locate all of the OCR zones in the same place. On the 7731, this
includes the MICR line, as it is read as an E13B OCR line. If the zone for each stub is
located in the same place as the zone for the check, then the 7731 can use the same
grayscale image data for every OCR process. The grayscale data has the best read rate.
If the OCR zone is located in a different position than the MICR line of the check, then
the Flex software will attempt to create one zone that will incorporate both of the
individual zones. The OCR zone used in the 7731 hardware has a maximum area of about 6 square inches.
If both zones cannot be combined into a zone of this size or smaller, the software will
create two zones instead. The tradeoff in doing this is that the second zone (used for
MICR in Flex) must be taken from the bi-level (black and white) image that was lifted.
This black and white image will provide a slightly lower read rate. The same goes for
multiple stub OCR line definitions. They should be kept in the same zone for best results.
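The zone-combining behavior described above can be sketched in a few lines of Python. This is
only an illustration of the rule as we have described it, not the Flex implementation: the
Zone class, the plan_zones function, the inch-based coordinates measured from the top-left
of the image, and the exact 6.0 limit (the text says "about 6") are all assumptions made for
the example.

```python
from dataclasses import dataclass

# Assumed limit: the text says "about 6 square inches"; the exact value is hypothetical.
MAX_ZONE_AREA_SQ_IN = 6.0


@dataclass(frozen=True)
class Zone:
    """A rectangular OCR zone in inches, measured from the top-left of the image."""
    left: float
    top: float
    right: float
    bottom: float

    @property
    def area(self) -> float:
        return (self.right - self.left) * (self.bottom - self.top)


def bounding_zone(a: Zone, b: Zone) -> Zone:
    """Smallest single rectangle that contains both zones."""
    return Zone(
        left=min(a.left, b.left),
        top=min(a.top, b.top),
        right=max(a.right, b.right),
        bottom=max(a.bottom, b.bottom),
    )


def plan_zones(stub: Zone, micr: Zone, max_area: float = MAX_ZONE_AREA_SQ_IN) -> list:
    """Return one combined zone if it fits the area limit, else the two original zones.

    One zone means every read can use the grayscale data; two zones means the
    second read falls back to the bi-level image.
    """
    combined = bounding_zone(stub, micr)
    if combined.area <= max_area:
        return [combined]
    return [stub, micr]


# Stub OCR line in the same place as the check MICR line: one shared zone.
same_place = plan_zones(Zone(0.5, 2.9, 8.0, 3.5), Zone(0.5, 2.9, 8.0, 3.5))

# Stub OCR line near the top of the document: the combined zone is too large.
far_apart = plan_zones(Zone(0.5, 0.3, 8.0, 0.9), Zone(0.5, 2.9, 8.0, 3.5))

print(len(same_place), len(far_apart))
```

In the second case the combined rectangle would span most of the document height, so the
sketch falls back to two zones, which is the situation the recommendation above is meant to
avoid.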
2. ICR Considerations
ICR reading rates are still only marginally cost-justifiable, and so every consideration
should be taken to provide for the best possible read rates. We recommend that all ICR
customers have their documents test printed (if applicable) so that as many issues as
possible are resolved before the site goes live with new documents.
A good read rate for a normal document
is probably around 60-70% in a meter-read type application like the one shown on the example
stub. Because these read rates are achieved during a batch process, leaving the operator
to key only the remaining 30-40% of the meter readings is a time savings that can be
justified. Some stub layouts
have already been set up for ICR, so if your application matches a previously set up
configuration, it is recommended that you mimic that layout. Here are some of the issues
to keep in mind as you design the ICR portion of your stub:
- Character box shading: The location of the
handwriting needs to be consistent for proper recognition. To ensure that the data is
consistently placed, shaded boxes are provided. The shading should be done using
"no-repro" blue or green "drop-out" ink, which will cause the shaded
box to be invisible to the image camera. The boxes should not be printed solid, but at a
100 and 50 screen. This will help keep an over-sensitive camera from seeing the ink even
if it is within specifications.
- Location of the ICR boxes: The boxes should have a
clear band around them of at least 1/8 inch. The clearer the area surrounding
the boxes, the better the chance of reading handwriting. The boxes often are not filled in
properly, and if the handwriting strays out of the box, it can still be read if the area
around it is clear. The example we have above does not follow this rule for the
"Amount Enclosed" text printed above the ICR boxes, and the read rates have
suffered slightly because of this.
- Spacing and Size: The boxes should be 2/10
of an inch square. The spacing between the boxes should be 1/10 inch.
- ICR-Related Data: There are times when
the information filled in the ICR field will not be valid. In these cases, it is helpful
to add some data to the OCR line to indicate that ICR is not to be performed. In the cases
where the ICR data is a meter reading (as is the case above), the bill may be of a special
type, where a meter reading is not desired, though the shaded boxes appear on the standard
form. The meter reading may not be for the current month, so the billing month could be
included on the OCR line for verification. These special cases should be considered while
creating the bill so that the OCR line can be designed to include special (numeric) flags
that will assist in getting the best performance possible.
- ICR instructions: Information should be included somewhere on the bill on the
importance of properly filling in the ICR boxes. An example of properly filled in boxes
would also be helpful.
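As a quick check on the box size, spacing and clear band figures above, the footprint of an
ICR field can be worked out from the number of boxes. The sketch below is only a layout aid
under stated assumptions: the function name icr_field_footprint is hypothetical, the boxes
are assumed to sit in a single row, and the 1/8 inch clear band is assumed to be measured
from the outer edge of the boxes on all four sides.

```python
from fractions import Fraction

BOX_SIDE = Fraction(2, 10)    # each box is 2/10 inch square
BOX_GAP = Fraction(1, 10)     # 1/10 inch between adjacent boxes
CLEAR_BAND = Fraction(1, 8)   # minimum clear band around the boxes


def icr_field_footprint(box_count: int):
    """Return (boxes_width, total_width, total_height) in inches for one row of boxes.

    boxes_width covers the boxes and the gaps between them; the totals add the
    minimum clear band on every side (an assumption about how it is measured).
    """
    if box_count < 1:
        raise ValueError("an ICR field needs at least one box")
    boxes_width = box_count * BOX_SIDE + (box_count - 1) * BOX_GAP
    total_width = boxes_width + 2 * CLEAR_BAND
    total_height = BOX_SIDE + 2 * CLEAR_BAND
    return boxes_width, total_width, total_height


# Example: a five-digit meter reading.
boxes_width, total_width, total_height = icr_field_footprint(5)
print(float(boxes_width), float(total_width), float(total_height))
```

For a five-digit meter reading this gives a row of boxes 1.4 inches wide, or an area of
1.65 by 0.45 inches once the minimum clear band is included. No other printing, such as a
field label, should fall inside that area.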
3. ICR Registration Mark
- Importance of registration mark: When performing the ICR
process, the software needs accurate alignment of the ICR boxes to find and interpret the
handwriting properly. To provide the best possible read rates, a registration mark printed
in black ink should be provided in perfect registration with the ICR box shading. Without
this registration mark, the ICR process will suffer a large read rate decline. A loss of
20-30% could occur.
- Type of mark: The registration mark should be a crossbar
made of a vertical and horizontal line, or a corner (as the example above uses). This
allows the image to be adjusted horizontally and vertically in the digital realm
before processing the handwriting. The line should be of medium thickness, enough to
provide a clear black line when imaged. Since we generally image documents at 200 DPI, the
thickness should probably be about 5-8 pixels, making it about 0.025 to 0.040 inch
wide. Outward from the corner or intersection of these lines, there should be no other
printed information for at least 3/8 inch.
- Preprinted Forms: When using drop-out ink, a
preprinted form is often used to eliminate the need for a two-color laser production
printer. When the document is pre-printed with the ICR boxes, customer logo, etc., the
registration mark should also be printed in black ink with careful attention to the
registration of the print of each color. It is critical to keep the relative position of
the ICR boxes to the registration mark perfectly consistent.
- Multiple Registration Marks: There are times
when it can be helpful to provide more than one registration mark. When this is done,
usually two marks are made, one at each of two opposite corners of the area where ICR is
being performed. This allows the software to provide scaling and rotational correction of
the image. In the current line of transports supported by Flex, this is not a large
concern, as scanning is nearly always a flat and level operation, and the image pixel
density is relatively constant. Scaling would not be a concern. If the installer sees a
scanning product in the future for their customer that may provide a less reliable image
and may need these types of correction, plan up-front for registration marks at the
top-left and bottom-right (or similar) so that we are ready to implement the new hardware
down the road with little or no change to the customer's document(s).
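The arithmetic behind the registration marks can be illustrated with a short Python sketch.
It is not the ICR software's actual method, only a plausible outline: the function names are
hypothetical, points are assumed to be (x, y) pixel coordinates in a 200 DPI image, and the
two-mark case assumes the skew is small.

```python
import math

DPI = 200  # the text says documents are generally imaged at 200 DPI


def pixels_to_inches(pixels: float, dpi: int = DPI) -> float:
    """Convert a line thickness in pixels to inches at the given resolution."""
    return pixels / dpi


def translation_from_mark(expected, found):
    """Single mark: horizontal and vertical shift needed to line up the ICR boxes."""
    return (found[0] - expected[0], found[1] - expected[1])


def scale_and_rotation(expected_a, expected_b, found_a, found_b):
    """Two marks at opposite corners: estimate scale factor and rotation in degrees.

    Compares the vector between the two marks as designed with the vector
    between the two marks as found in the image.
    """
    ex, ey = expected_b[0] - expected_a[0], expected_b[1] - expected_a[1]
    fx, fy = found_b[0] - found_a[0], found_b[1] - found_a[1]
    scale = math.hypot(fx, fy) / math.hypot(ex, ey)
    rotation = math.degrees(math.atan2(fy, fx) - math.atan2(ey, ex))
    return scale, rotation


# A 5-8 pixel line at 200 DPI, expressed in inches.
thin, thick = pixels_to_inches(5), pixels_to_inches(8)

# One mark found 3 pixels right of and 2 pixels above its designed position.
shift = translation_from_mark(expected=(100, 50), found=(103, 48))

# Two marks found exactly where designed: no scaling, no rotation.
scale, rotation = scale_and_rotation((100, 50), (500, 250), (100, 50), (500, 250))
print(thin, thick, shift, scale, rotation)
```

With a single mark only the shift can be recovered, which is why a second mark at the
opposite corner is needed before scaling or rotation can be corrected.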