Fwd: PDF Summarizer for Indian Texts using AI


विश्वासो वासुकेयः

Jul 9, 2024, 10:22:13 AM
to sanskrit-programmers


---------- Forwarded message ---------
From: eGangotri Digital Preservation Trust <Unknown>
Date: Monday 8 July, 2024 at 7:52:39 am UTC+5:30
Subject: PDF Summarizer for Indian Texts using AI
To: भारतीयविद्वत्परिषत् <Unknown>


Dear All,

As you know, at eGangotri we digitize Sanskrit books. These can be in not just Devanagari but also Kannada, Telugu and other scripts, and the books are not necessarily in Sanskrit.

Currently, to create metadata we have to go the tedious manual way: opening a PDF, identifying the cover pages, and then typing the details in Roman script. Sometimes we use Google Lens.

Is there any tool or turnkey suite (commercial products welcome) that can look at a folder of, say, 100 image PDFs (we work with scans of physical books, pre-OCR) and use AI to generate an Excel sheet with details such as:

Title of PDF
Title of Text
Author of Text
Year of Publication
Publisher
Subject
Summary

The automation would include OCR and any other pipeline steps needed.
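
To make the desired output concrete, here is a minimal TypeScript sketch of one spreadsheet row per PDF and a CSV export that Excel can open. Everything named here is an assumption for illustration only: the `BookMetadata` field names, the `Extractor` callback (a stand-in for whatever OCR and AI step is eventually used, which this sketch does not implement), and the choice of CSV rather than a native Excel file.

```typescript
// One spreadsheet row per scanned PDF; the fields mirror the list above.
// All names in this sketch are hypothetical.
interface BookMetadata {
  pdfTitle: string;
  textTitle: string;
  author: string;
  year: string; // kept as text, since the year may be missing or uncertain
  publisher: string;
  subject: string;
  summary: string;
}

// Placeholder for the OCR + AI step: given a PDF path, return everything
// except the PDF title. No real OCR or AI service is called in this sketch.
type Extractor = (pdfPath: string) => Promise<Omit<BookMetadata, "pdfTitle">>;

const COLUMNS: (keyof BookMetadata)[] = [
  "pdfTitle", "textTitle", "author", "year", "publisher", "subject", "summary",
];

const HEADERS: string[] = [
  "Title of PDF", "Title of Text", "Author of Text", "Year of Publication",
  "Publisher", "Subject", "Summary",
];

// Quote a cell only when it contains a comma, a double quote, or a line break.
function csvCell(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Build the CSV text. The leading byte-order mark helps Excel detect UTF-8,
// which matters for titles typed in Devanagari, Kannada, Telugu, etc.
function toCsv(rows: BookMetadata[]): string {
  const lines: string[] = [HEADERS.map(csvCell).join(",")];
  for (const row of rows) {
    lines.push(COLUMNS.map((column) => csvCell(row[column])).join(","));
  }
  return "\uFEFF" + lines.join("\r\n") + "\r\n";
}

// Run the extractor over each PDF path, using the file name as "Title of PDF".
async function describePdfs(
  pdfPaths: string[],
  extract: Extractor,
): Promise<BookMetadata[]> {
  const rows: BookMetadata[] = [];
  for (const pdfPath of pdfPaths) {
    const fileName = pdfPath.split(/[\\/]/).pop() ?? pdfPath;
    rows.push({ pdfTitle: fileName, ...(await extract(pdfPath)) });
  }
  return rows;
}
```

Listing the folder and writing the file are left to the caller, so the sketch stays independent of any particular runtime. The extractor is passed in as a callback so that different OCR or AI back ends could be tried without changing the export code.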

If such an AI-based tool exists in the market for Indian languages, then please share some pointers.

If none exists, then as a Node.js/React developer I could attempt to create one myself. Any pointers will be helpful.




