Text recognition (OCR)

Richard I
Does macOS or iOS have a text recognition facility 'out of the box'?

I wish to convert scans of pages of text to a form that can be edited and used in the Pages app.

If not, does anyone use and therefore recommend an app to do it?

Re: Text recognition (OCR)

Euan Williams
You may find the LEADTOOLS OCR App really useful. The basic version is free. I have used it for many years, most recently to scan 500 pages of typescript from the 1950s, which it did quickly and reasonably accurately.

Remember that success depends on the resolution at which you scan the document. 300 dpi is a minimum, and sometimes you may have to scan at 750 dpi or higher. Only experiments on your particular document will tell you what to use. There is a trade-off between the time it takes to scan and the time it takes to clean up the result.
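To get a feel for why higher resolutions take longer, here is a rough Python sketch of how the pixel count of a scan grows with dpi. The page size (US Letter, 8.5 x 11 inches) and the function name `scan_pixels` are my own assumptions for illustration, not anything the app uses.

```python
def scan_pixels(width_in, height_in, dpi):
    """Return (width, height) in pixels for a page scanned at the given dpi."""
    return round(width_in * dpi), round(height_in * dpi)


# Assumption: a US Letter page (8.5 x 11 inches); substitute your own paper size.
PAGE_WIDTH_IN, PAGE_HEIGHT_IN = 8.5, 11.0

for dpi in (300, 750):
    w, h = scan_pixels(PAGE_WIDTH_IN, PAGE_HEIGHT_IN, dpi)
    megapixels = w * h / 1_000_000
    print(f"{dpi} dpi: {w} x {h} pixels ({megapixels:.1f} megapixels)")
```

The pixel count grows with the square of the resolution, so a 750 dpi scan holds 6.25 times as many pixels as a 300 dpi scan of the same page.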

When you open the app you will see a sample scan of a plain-text description page. Use this to learn the app.

It has a slightly eccentric interface, but the basic workflow is, from the top menu bar:

1. "Open Document"

2. "Draw Zone" [select the area of the page you want to recognize]

3. "Recognize" [runs the OCR and puts the result in the Page Text box at the bottom of the screen]

4. Click once in that text window and press Cmd-A to select all the text.

5. Copy the text, paste it into TextEdit (or Pages, but TextEdit is really good at 'cleaning up' text without worrying about formatting) and Save.

6. In the top menu bar "Zones > Clear all Zones"

7. Delete the text in the bottom box.

TextEdit has a range of keyboard modifiers for moving the cursor to the beginning and end of lines, moving forward and back between words, deleting carriage returns, etc. Look in the Help files for these. They will speed up your cleaning operation wonderfully if you can learn at least one or two.
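If you are comfortable running a script, one part of the clean-up (removing the line breaks inside paragraphs) could also be done in Python. This is only a sketch: the function name `unwrap_paragraphs` is made up for illustration, and it assumes the OCR output separates paragraphs with a blank line, which may not hold for every document.

```python
import re


def unwrap_paragraphs(text):
    """Join the lines inside each paragraph, keeping blank-line paragraph breaks."""
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    joined = []
    for paragraph in paragraphs:
        # Drop empty lines and surrounding whitespace, then join with single spaces.
        lines = [line.strip() for line in paragraph.splitlines() if line.strip()]
        joined.append(" ".join(lines))
    return "\n\n".join(joined)


sample = "The quick\nbrown fox.\n\nNext paragraph\nhere.\n"
print(unwrap_paragraphs(sample))
```

Words that were hyphenated across a line break are left as they are, so those still need checking by hand.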

If you have a lot of pages to scan, don't forget to insert numbered markers in your TextEdit text so you can easily check the OCR text against the original.
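If you keep each page's OCR text separately, the numbered markers could be added automatically. A minimal Python sketch follows; the `[[page N]]` marker format and the function name `join_pages_with_markers` are hypothetical choices of mine, so use whatever marker you find easy to search for.

```python
def join_pages_with_markers(pages, start=1):
    """Join per-page OCR text, putting a numbered marker line before each page."""
    chunks = []
    for number, text in enumerate(pages, start=start):
        # Hypothetical marker format; anything unlikely to occur in the text will do.
        chunks.append(f"[[page {number}]]\n{text.strip()}")
    return "\n\n".join(chunks)


pages = ["First page of typescript.", "Second page of typescript."]
print(join_pages_with_markers(pages))
```

Searching the combined text for a marker such as `[[page 2]]` then takes you straight to the text that came from that page of the original.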

Re: Text recognition (OCR)

Richard I
Thanks Euan, I will investigate...