DTIC ADA447997: Groundtruth Generation and Document Image pdf

Description of the Book:

The problem of generating synthetic data for the training and evaluation of document analysis systems has been widely addressed in recent years. With the increased interest in processing multilingual sources, however, there is a tremendous need to rapidly generate data in new languages and scripts without developing specialized systems. We have developed a system that uses the language support of the MS Windows operating system, combined with custom print drivers, to render TIFF images simultaneously with Windows Enhanced Metafile directives. The metafile information is parsed to generate zone, line, word, and character ground truth, including location, font information, and content, in any language supported by Windows. The resulting images can be physically or synthetically degraded by our degradation modules and used for training and evaluating Optical Character Recognition (OCR) systems. Our document image degradation methodology incorporates several often-encountered types of noise at the page and pixel levels.

Examples of OCR evaluation and synthetically degraded document images are given to demonstrate the effectiveness
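
The description mentions ground truth at the zone, line, word, and character levels, each carrying location, font information, and content. As a rough illustration only, the sketch below shows one way character-level records might be rolled up into word-level records. The names `Glyph`, `Word`, `union`, and `words_from_glyphs`, the pixel box layout, and the sample data are all hypothetical; the report's actual record format and grouping rules are not given here, and splitting on whitespace is an assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed box layout: (left, top, right, bottom) in image pixels.
Box = Tuple[int, int, int, int]


@dataclass
class Glyph:
    char: str  # content of one character
    box: Box   # location on the page image
    font: str  # font name recorded for the character


@dataclass
class Word:
    text: str
    box: Box
    font: str


def union(boxes: List[Box]) -> Box:
    """Return the smallest box that encloses every box in the list."""
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


def words_from_glyphs(glyphs: List[Glyph]) -> List[Word]:
    """Group consecutive non-space glyphs into word-level records."""
    words: List[Word] = []
    current: List[Glyph] = []
    # A trailing space acts as a sentinel so the last word is flushed.
    for g in glyphs + [Glyph(" ", (0, 0, 0, 0), "")]:
        if g.char.isspace():
            if current:
                words.append(Word(
                    text="".join(c.char for c in current),
                    box=union([c.box for c in current]),
                    font=current[0].font,  # assumes one font per word
                ))
                current = []
        else:
            current.append(g)
    return words


# Hypothetical character-level records for the text "OCR ok".
glyphs = [
    Glyph("O", (10, 20, 18, 32), "Arial"),
    Glyph("C", (19, 20, 27, 32), "Arial"),
    Glyph("R", (28, 20, 36, 32), "Arial"),
    Glyph(" ", (37, 20, 40, 32), "Arial"),
    Glyph("o", (41, 24, 47, 32), "Arial"),
    Glyph("k", (48, 20, 54, 32), "Arial"),
]
words = words_from_glyphs(glyphs)
```

The same union-of-boxes step could be repeated to build line and zone records from words, although how the original system delimits lines and zones is not stated in this description.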

  • Creator/s: Defense Technical Information Center
  • Date: 5/1/2005
  • Year: 2005
  • Book Topics/Themes: DTIC Archive, Zi, Gang, MARYLAND UNIV COLLEGE PARK DEPT OF COMPUTER SCIENCE, *DEGRADATION, *DOCUMENTS, METHODOLOGY, PIXELS, METADATA, OPTICAL CHARACTER RECOGNITION, OPERATING SYSTEMS(COMPUTERS), PROGRAMMING LANGUAGES
