Update on the Sanskrit texts & looking for open sources


keshav Mishra

May 25, 2026, 10:12:19 AM
to ambuda-discuss
Hi Arun and everyone,

Following up on my note to Arun a few weeks ago: thanks for pointing me to GRETIL and sanskritdocuments.org. Both were useful but, sadly, neither grants permission for commercial use.

Since then, I've spent time going through 100+ platforms trying to fill in gaps in my corpus: book publishers, university digitisation projects, Internet Archive, Ministry of Culture collections, foreign archives. My conclusion: Ambuda is the best source for texts that are complete, proofread, machine-readable, and open-licensed at the same time. Nothing else I found clears all four bars at once.

A few sources I found that might be useful for Ambuda's expansion:

- https://egangotri.org/
- https://www.sanskritebooks.org/
- https://guides.libraries.emory.edu/c.php?g=576070&p=3973627
- https://titus.uni-frankfurt.de
- https://sa.wikisource.org
- https://vedicreserve.miu.edu

My present corpus comprises the main Upanishads, Stotras, Ramcharitmanas, Ramayana, Mahabharata, Gita, and Vedas. Still missing are the Puranas, the Shastras, and the Samhitas. I need the original Sanskrit verse only; no copyrighted translations. If anyone on the list knows of proofread sources for these that are licensed permissively enough for commercial use, I'd love to hear about them.

Keshav

Ronit Kumar

May 25, 2026, 10:25:22 AM
to ambuda-discuss
Greetings,

The sources listed here are useful. Might I add:
- [siva.sh] https://www.siva.sh

Also, as per my understanding of Indian law, and based on discussions I have had with many people working in this field, the original Sanskrit verses belong to the public domain and cannot be copyrighted.
https://www.livelaw.in/high-court/delhi-high-court/delhi-high-court-copyright-religious-scriptures-bhagwat-gita-238677

Warm regards,
Ronit


keshav Mishra

May 25, 2026, 12:44:00 PM
to ambuda-discuss
Thanks, Ronit,

For all 18 Mahapuranas, though, neither the DCS nor siva.sh hosts a complete Sanskrit verse corpus, and the gaps are significant.

The copyright point is true, but in practice the problem isn't the Sanskrit verse itself. It is that almost every digital edition carries publishers' or proofreaders' rights over its specific encoding and formatting. The public-domain source is accessible, but clean, machine-readable versions are tough to curate. Internet Archive scans of older editions are the fallback, but the OCR and scan quality are rough.

That is exactly why an Ambuda-like approach matters. Happy to hear if anyone knows of a complete Mahapurana text in a usable format.

Keshav