OCR

Perhaps there is a very easy solution and I am simply missing something (at least I hope so).

Invariably, we have customers who come in with forms/menus/documents that they want edited and reproduced, but that they no longer have on file.

Although I am a relatively quick typist, OCR appeals to me in more than one way (it is still faster, and arguably more accurate, e.g., fewer word omissions and such).

 

Anyway, my issue is importing OCR scans into (or exporting them to) Corel Draw. We have been using ABBYY Fine Reader v9.0 (I tried 10, but it gave me a few problems).

Saving scans to a PDF never seems to work well, because Corel seems to be incapable of importing PDFs without screwing things up. Regardless of whether I import text as "text" or "curves," the general formatting makes the file(s) worthless because of the amount of reformatting necessary.

Creating an M$Word document poses serious import issues with Corel Draw as well (although copying and pasting small sections into Corel can work on smaller jobs).

And of course, less sophisticated formats (e.g., rich text format) do not help as far as formatting goes.

My Question:

Is there a program/macro/setting that will create a Corel Draw-ready OCR scan, so that I do not have to reformat just about everything?

Thank you.

Nick

 

Parents Reply
  • harryLondon said:

    One thing to be very careful about when using free OCR: often, the character recognition is relatively poor, but it is enhanced by comparing the results with common dictionary words to decide what a fuzzy character within a word is most likely to be.

    This works well for paragraph text but is likely to give errors in numeric data that appears within the text, because a 6 may differ from an 8 by only a few pixels and there is no equivalent way to decide whether one is more likely than the other.

    People are often fooled by the apparent accuracy of the recognised text into assuming that the result is a faithful reproduction of the document, but in practice text from OCR needs to be proofread even more carefully than typed-in text.

    I agree, the accuracy of OCR can't be 100%; all OCR software may have this problem.
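
To illustrate the point in the quoted post, here is a rough Python sketch of dictionary-based correction. It is not how ABBYY or any particular OCR engine actually works; the word list, the `correct_token` helper and the 0.8 similarity cutoff are all made up for the example. It only shows why a misread letter inside a word can be repaired while a misread digit cannot.

```python
import difflib

# Hypothetical, tiny word list; a real engine would use a full dictionary.
DICTIONARY = ["character", "recognition", "total", "invoice", "price"]


def correct_token(token, dictionary=DICTIONARY, cutoff=0.8):
    """Return (text, verified) for one OCR token.

    Tokens containing letters are snapped to the closest dictionary word
    when one is similar enough. Tokens without letters (numbers, prices)
    are returned unchanged and unverified, because no word list can say
    whether a 6 or an 8 was printed.
    """
    if not any(ch.isalpha() for ch in token):
        return token, False
    matches = difflib.get_close_matches(
        token.lower(), dictionary, n=1, cutoff=cutoff
    )
    if matches:
        return matches[0], True
    return token, False


if __name__ == "__main__":
    # "rec0gnition" has a zero misread for the letter "o".
    for raw in ["rec0gnition", "168", "166"]:
        text, verified = correct_token(raw)
        status = "dictionary match" if verified else "needs proofreading"
        print(f"{raw!r} -> {text!r} ({status})")
```

The word with the misread letter gets pulled back to a dictionary entry, while "168" and "166" both pass through untouched and unverified, so the numbers are where the proofreading effort has to go.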
