Install Tools
First off, let's install Homebrew to simplify managing custom packages. It keeps everything under the /usr/local/ directory, so it won't interfere with the software that ships with OS X. To do this, open the Terminal application, copy and paste the following command, and press Return to execute it. You will be prompted to authorize the installation and may be asked to install some OS X command line tools directly from Apple.
ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
Next we need to install Tesseract and some supporting libraries. Again in the Terminal window, enter each of these lines and press Return to execute them separately. They may take a few minutes to complete before the command prompt returns.
brew install imagemagick
brew install tesseract --all-languages
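Before moving on, it can help to confirm that both tools ended up somewhere the shell can find them. The snippet below is a minimal sketch, not part of the original setup: `check_tool` is a hypothetical helper name, and it assumes Homebrew linked the binaries into /usr/local/bin as described above.

```shell
# Make sure Homebrew's bin directory is searched, as the applet script does.
PATH="$PATH:/usr/local/bin"

# Hypothetical helper: report whether a command can be found on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1
    then
        echo "$1: found"
    else
        echo "$1: not found"
    fi
}

# convert comes from ImageMagick; tesseract is the OCR engine itself.
for tool in convert tesseract
do
    check_tool "$tool"
done
```

If both are reported as found, running tesseract --list-langs should list the installed language packs, including eng.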
Configure Applet
Next we need to create a user-friendly way to do the OCR. We can easily do this with Automator in OS X. Open Automator and create a new Application. On the left, search for and add Run Shell Script and Display Notification, in that order. In the Run Shell Script step, change Pass input to "as arguments" and add the following code.
PATH="$PATH:/usr/local/bin"
for var in "$@"
do
    convert "$var" -resize 400% -type Grayscale - | tesseract -l eng - - | pbcopy
done
As an alternative, the following code converts the entire result to a single line of text if that is preferred, but it may cause issues if the image contains columns of text.
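PATH="$PATH:/usr/local/bin"
for var in "$@"
do
    text=`convert "$var" -resize 400% -type Grayscale - | tesseract -l eng - -`
    # Append this image's text; the space keeps words from adjacent images apart.
    ocr="$ocr $text"
done
# $ocr is left unquoted on purpose so the shell collapses line breaks into spaces.
echo $ocr | pbcopy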
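One thing to be aware of with the first script is that pbcopy runs inside the loop, so when several images are dropped at once only the last result stays on the clipboard. The following is a rough sketch of one way around that while keeping line breaks intact. It reuses the same convert and tesseract pipeline; `ocr_all` is a hypothetical helper name introduced here for illustration.

```shell
PATH="$PATH:/usr/local/bin"

# Hypothetical helper: OCR each file in turn and print all results to stdout.
ocr_all() {
    for var in "$@"
    do
        convert "$var" -resize 400% -type Grayscale - | tesseract -l eng - -
    done
}

# Copy the combined text once, and leave the clipboard alone if no files were passed.
if [ "$#" -gt 0 ]
then
    ocr_all "$@" | pbcopy
fi
```

Because the whole loop feeds a single pbcopy, the text from every image ends up on the clipboard in the order the files were passed in.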
Next, add some text to the notification step so you know when the task is done processing. I added a title of "OCR Finished" and a message of "Text was copied to your clipboard." Then just save the application and give it a name.
Using the Applet
To use the applet, first find it and drag it down to your Dock to make a shortcut. Then you can drag image files onto the application in the Dock and it will do its magic. Once you drop an image onto the applet, it will take a few seconds to process, and you should see the notification pop up in the corner when it is done. At this point you can paste the resulting text into a program of your choice, clean it up, and do what you want with it.
Other things that can be done with a bit of tweaking to the above scripts:
- Processing multiple input files at once.
- Saving results to a text file on the desktop or source folder instead of the clipboard.
- Opening the resulting file automatically.
- Removing the notification step if desired.
- Creating a Folder Action instead to automatically run on files added to a specified folder.
- Passing advanced tesseract options in the script, though in my experience these were not needed.
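As an illustration of the first two ideas in the list, here is a hedged sketch that handles every dropped image and writes each result to a text file in the source folder instead of the clipboard. The helper names `ocr_output_path` and `ocr_to_file` are made up for this example, and the naming scheme (same name with a .txt extension) is just one assumption about where the output should go.

```shell
PATH="$PATH:/usr/local/bin"

# Hypothetical helper: map "/path/scan.png" to "/path/scan.txt".
ocr_output_path() {
    local dir base
    dir=$(dirname "$1")
    base=$(basename "$1")
    printf '%s/%s.txt\n' "$dir" "${base%.*}"
}

# Hypothetical helper: run the same convert | tesseract pipeline as the
# applet script, but write the text to a file beside the source image.
ocr_to_file() {
    local out
    out=$(ocr_output_path "$1")
    convert "$1" -resize 400% -type Grayscale - | tesseract -l eng - - > "$out"
}

# Process every image that was dropped onto the applet.
for var in "$@"
do
    ocr_to_file "$var"
done
```

With this variant the notification message would need rewording, since nothing is copied to the clipboard. Adding a call to the open command on the resulting file inside the loop would be one way to cover the "opening the resulting file automatically" idea as well.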