Using Google for OCR

Amit Agarwal has posted a tip on his blog about using Google to convert PDFs to text. For some reason, he suggests putting all your PDF documents on the web:

Create a folder in your website (say and upload all the PDF images to that folder. Now create a public web page that links to all the PDF files. Wait for the Google bots to spider your stuff.

Once done, type the query “ filetype:pdf” to see the PDF documents as HTML.

Why would you want your documents to be accessible by anyone? Why wait for Google to index your page?

There’s a much easier way that I’ve been using, and one of the commenters on Agarwal’s blog points it out:

You can upload the Scanned PDFs to Gmail and sent it you only. Then Open your Inbox and the mail sent from you, you have an option to View as HTML. That will solve the Hosting problem.
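If you have a pile of scans, the emailing step can be scripted. Here is a minimal sketch, not a tested recipe: it assumes your Gmail account allows SMTP logins (typically with an app password), and the function names are my own invention. You still have to open the message in Gmail and click View as HTML yourself.

```python
import smtplib
from email.message import EmailMessage
from pathlib import Path


def build_self_addressed_message(address: str, pdf_path: str) -> EmailMessage:
    """Build an email from yourself to yourself with one scanned PDF attached."""
    pdf = Path(pdf_path)
    msg = EmailMessage()
    msg["From"] = address
    msg["To"] = address  # sent to yourself only, so nothing is public
    msg["Subject"] = f"Scanned PDF: {pdf.name}"
    msg.set_content("Scanned PDF attached. Open it in Gmail and use View as HTML.")
    msg.add_attachment(
        pdf.read_bytes(),
        maintype="application",
        subtype="pdf",
        filename=pdf.name,
    )
    return msg


def send_to_self(address: str, password: str, pdf_path: str) -> None:
    """Send the message through Gmail's SMTP server.

    Assumption: smtp.gmail.com on port 465 (SSL) and a password that
    Gmail accepts for SMTP logins, such as an app password.
    """
    msg = build_self_addressed_message(address, pdf_path)
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as smtp:
        smtp.login(address, password)
        smtp.send_message(msg)
```

Calling send_to_self("you@gmail.com", "your-app-password", "scan.pdf") would mail a single file; loop over a folder to send several. Both the address and the password here are placeholders.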
