Data base boosts handwriting analysis use.

Federal researchers have created a data base of handwritten characters for studying or testing computerized systems that can read various handwriting styles.

Developed at the National Institute of Standards and Technology (NIST), the data base is well suited for educational uses and is an important tool in developing handwriting recognition systems for banks, insurance companies, and other form-processing organizations.

Currently, some machines can identify characters printed by other machines, as well as some handwriting if the writer has carefully formed each character. Still to come, however, are systems that can accurately read unconstrained script. Such systems must learn to recognize individual writing patterns. NIST's data base, with its immense statistical sampling of handwriting, can help.

The new compilation upgrades a two-year-old NIST data base that included more than one million handprinted characters. Those characters, however, were contained within more than 2,000 pages of survey forms filled out by volunteers to sample a spectrum of writing techniques. Users studying individual characters had to isolate these letters and numbers from the data base. This required time and technical sophistication.

In the new data base, known as NIST Special Database 3, character isolation is done for the user. This broadens the data base's use as a training tool, says NIST researcher Michael Garris, who developed Special Database 3 with R. Allen Wilkinson.

"Special Database 3 will be much more usable to the novice in this field than [the earlier data base] was," Garris says. "Universities, for example, that are trying small projects such as algorithm development will find Special Database 3 valuable."

NIST acquired Special Database 3's handwriting specimens by asking more than 2,000 US Census Bureau employees across the country to fill out detailed forms that sampled individual writing styles. Besides writing strings of letters and numbers, volunteers copied the preamble to the Constitution.

The resulting data base contains 313,389 character images, each isolated in its own 128-by-128-pixel area. The characters break down into three groups: 223,125 digits; 44,951 upper-case letters; and 45,313 lower-case letters.
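For readers who want to check the figures, the breakdown can be tallied in a few lines of Python. This is only an illustrative sketch: the counts and the 128-by-128 image size come from the article, while the storage estimate assumes one bit per pixel with no compression, which the article does not state.

```python
# Character counts as quoted in the article.
COUNTS = {
    "digits": 223_125,
    "upper-case letters": 44_951,
    "lower-case letters": 45_313,
}
SIDE = 128  # each character sits in a 128-by-128-pixel area

total = sum(COUNTS.values())
shares = {name: count / total for name, count in COUNTS.items()}

# Assumption (not from the article): 1 bit per pixel, uncompressed.
bytes_per_image = SIDE * SIDE // 8
raw_bytes = total * bytes_per_image

print(f"total images: {total:,}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"raw size under the 1-bit assumption: {raw_bytes:,} bytes")
```

The three groups do sum to the stated total, and digits make up the large majority of the samples.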

Special Database 3 is available from NIST on compact disc read-only memory (CD-ROM). System requirements are a 5 1/4-inch CD-ROM drive with software to read the ISO 9660 format. The cost is $895. For information, write Standard Reference Data Program, A323 Physics Building, NIST, Gaithersburg, MD 20899.
COPYRIGHT 1992 American Society for Industrial Security
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 1992 Gale, Cengage Learning. All rights reserved.

Article Details
Title Annotation:Security Spotlight
Publication:Security Management
Date:Aug 1, 1992