OCR with 'google docs upload file_name'


Sandeep K Chaudhary

Sep 23, 2013, 10:44:45 PM
to googlecl...@googlegroups.com
Hi guys,

I am trying to apply Google Docs' OCR feature to a file that I upload using googlecl. However, when I attempt this (even with the 'convert text from uploaded pdf and text files' option selected), the uploaded PDF file doesn't get converted to text. I have tried the same thing through the browser interface, and OCR works perfectly with that option. Can someone please tell me how I can get OCR to work with the googlecl command line utilities? I am trying to automate the conversion of some PDF files (which contain a substantial number of images with embedded text) to text using Google Docs' OCR, which is far better (in my experience) than other available OCR tools such as Tesseract. Please help me with this.

Thanks and regards,
Sandeep.

Sandeep K Chaudhary

Sep 25, 2013, 3:28:53 AM
to googlecl...@googlegroups.com
Can someone please reply? This is urgent, and some quick suggestions would be of great help. Thanks!

Ellie K

Sep 25, 2013, 4:02:49 PM
to googlecl...@googlegroups.com
Sandeep,
This might be useless, as it is old. There were problems in the past with uploading PDFs (and other file types) through GoogleCL and having Google Docs OCR convert them to text. It was reported in December 2010 as Issue 332 for the GoogleCL project ("Enable OCR for Docs upload"):
"google docs upload does not allow for Docs to apply OCR on the uploaded (scanned) document (either PDF, PNG, JPG)... but it works in the browser"

As far as I can see, the issue was never resolved or closed in the GoogleCL project. Take a look at this GoogleCL group discussion by someone else who was having the same problem (I think) as you are with OCR. He asked for help, then figured it out himself and posted his answer in that discussion. The solution seems unlikely because it is so simple, but maybe it is valid.

I hope this isn't a waste of your time. I used to play with GoogleCL, and helped test it just a little bit, in 2011, though I only come back to visit now.

~ Ellie
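
For anyone automating the upload outside googlecl, here is a minimal sketch of the kind of request that asks for OCR at upload time. It is not from this thread: it assumes the Google Drive API v2 upload endpoint and its `convert`, `ocr`, and `ocrLanguage` query parameters, and nothing here shows that googlecl exposes them. The helper name `build_ocr_upload_url` is hypothetical. The sketch only builds the request URL; authentication and the multipart upload body are left out.

```python
from urllib.parse import urlencode

# Assumption: Drive API v2 upload endpoint (not something googlecl is known to use).
DRIVE_V2_UPLOAD_URL = "https://www.googleapis.com/upload/drive/v2/files"


def build_ocr_upload_url(ocr_language=None):
    """Build an upload URL that requests conversion with OCR.

    Hypothetical helper for illustration only. The query parameters
    (convert, ocr, ocrLanguage) are assumed from the Drive API v2
    files.insert method.
    """
    params = {
        "uploadType": "multipart",  # metadata and file content in one request
        "convert": "true",          # convert the upload to a Google Docs format
        "ocr": "true",              # attempt OCR on PDF and image uploads
    }
    if ocr_language:
        # Optional language hint for OCR, e.g. "en"
        params["ocrLanguage"] = ocr_language
    return DRIVE_V2_UPLOAD_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_ocr_upload_url("en"))
```

The resulting URL would still need an authorized POST carrying the PDF, so treat this as a starting point for experiments rather than a confirmed fix for Issue 332.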


Sandeep K Chaudhary

Sep 25, 2013, 11:31:38 PM
to googlecl...@googlegroups.com
Thanks a lot for the reply, Ellie!

I will try the changes mentioned in the GoogleCL group discussion post. And it's definitely not going to be a waste of time. :) I will post the outcome of the changes in this thread.

Thanks and regards,
Sandeep.