maNipravALa (Sanskrit + Tamil) works are really hard to OCR properly. Google oft yields junk for minor-script strings. Would you have a good solution? Example files (devanAgarI + Tamil in this case): https://sendgb.com/1Tnbfjq2NIs

Another problem is with grantha script texts (there is a treasure trove of those): no OCR works well. Interested in solving that as well?
--
You received this message because you are subscribed to the Google Groups "sanskrit-programmers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sanskrit-program...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/sanskrit-programmers/CAFY6qgEp%2BH-ubHjf7yTy7m4AvsMWoSqCERY5o%3D7_GouZj5EuEA%40mail.gmail.com.
Are you asking from a cost perspective? How many tokens does it use per page on average? If it is on the order of 1k-10k tokens per page, then 1M tokens would cover about 100-1,000 pages. At roughly $3.5 per 1M tokens (input and output combined), a thousand pages would cost somewhere between about $3.5 and $35. Perhaps it'll be possible to find someone to sponsor the funds needed for such a project?
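To make the arithmetic above easy to redo with other numbers, here is a minimal sketch in Python. The default of $3.5 per 1M tokens is only the rough combined figure used in this message, not a quoted price, and `estimate_cost_usd` is a hypothetical helper name.

```python
def estimate_cost_usd(pages: int, tokens_per_page: int,
                      usd_per_million_tokens: float = 3.5) -> float:
    """Rough cost of transcribing `pages` pages with an LLM.

    usd_per_million_tokens is an assumed blended rate for input and
    output tokens; replace it with the actual price of the model used.
    """
    total_tokens = pages * tokens_per_page
    return total_tokens / 1_000_000 * usd_per_million_tokens


if __name__ == "__main__":
    # The two ends of the 1k-10k tokens-per-page range, for 1000 pages.
    for tokens_per_page in (1_000, 10_000):
        cost = estimate_cost_usd(1_000, tokens_per_page)
        print(f"{tokens_per_page} tokens/page: about ${cost:.2f} per 1000 pages")
```

Measuring the real token count on a handful of sample pages first would narrow the range considerably.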
Avinash

On Mon, Feb 23, 2026 at 6:09 AM विश्वासो वासुकिजः (Vishvas Vasuki) <vishvas...@gmail.com> wrote:
Gemini flash latest does a good job with this prompt accompanying an image: "give me the mixed script text here, exactly as it appears".
How to run this for thousands of pages?

On Thu, 17 Oct 2024 at 10:14, विश्वासो वासुकिजः (Vishvas Vasuki) <vishvas...@gmail.com> wrote:
maNipravALa (Sanskrit + Tamil) works are really hard to OCR properly. Google oft yields junk for minor-script strings. Would you have a good solution? Example files (devanAgarI + Tamil in this case): https://sendgb.com/1Tnbfjq2NIs
Another problem is with grantha script texts (there is a treasure trove of those): no OCR works well. Interested in solving that as well?
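On running this for thousands of pages: one possible shape is a small resumable batch loop, sketched below in Python. Everything here is an assumption for illustration. `ocr_directory` is a hypothetical helper, the pages are assumed to be PNG files in one folder, and `transcribe` is a placeholder for whatever function actually sends an image plus the prompt to the model and returns the text; no real API call is shown.

```python
import time
from pathlib import Path
from typing import Callable, List, Tuple

# The prompt quoted earlier in the thread.
PROMPT = "give me the mixed script text here, exactly as it appears"


def ocr_directory(image_dir, out_dir,
                  transcribe: Callable[[bytes, str], str],
                  retries: int = 3,
                  pause_s: float = 1.0) -> Tuple[List[str], List[str]]:
    """Transcribe every *.png in image_dir into a .txt file in out_dir.

    `transcribe(image_bytes, prompt)` is supplied by the caller (assumed).
    Pages that already have an output file are skipped, so an interrupted
    run can simply be started again. Returns (done, failed) file names.
    """
    image_dir, out_dir = Path(image_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    done, failed = [], []
    for image_path in sorted(image_dir.glob("*.png")):
        out_path = out_dir / (image_path.stem + ".txt")
        if out_path.exists():
            continue  # already transcribed in an earlier run
        for attempt in range(1, retries + 1):
            try:
                text = transcribe(image_path.read_bytes(), PROMPT)
                out_path.write_text(text, encoding="utf-8")
                done.append(image_path.name)
                break
            except Exception:
                if attempt == retries:
                    failed.append(image_path.name)  # give up on this page
                else:
                    time.sleep(pause_s * attempt)  # simple linear backoff
    return done, failed
```

Writing one text file per page keeps the run restartable and makes it easy to re-run only the pages listed in `failed`.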
Hi Vishvas,

I would like to try processing these files with a workflow I recently developed. I can't access the link you provided: https://sendgb.com/1Tnbfjq2NIs. Do you have another link where I can download the files? I will test them and share the results with you.