Need for database of Sanskrit books


Anunad Singh

Feb 27, 2017, 1:39:28 AM
to sanskrit-p...@googlegroups.com
The number of Sanskrit-language books is estimated to be in the crores. (The National Mission for Manuscripts in New Delhi estimates that there are 2-3 crore manuscripts.) This large number suggests that we should have a good database of Sanskrit books in this digital age. As far as I know, we have only short lists of books and their authors, with no more than a thousand or so entries.

We should start compiling a database of Sanskrit books with at least four or five data fields, such as the name of the book, author, probable date or period of composition, major subjects dealt with in the book, comments, etc.
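As a rough illustration of the record proposed above, here is a minimal Python sketch. The class name `BookRecord` and all field names are assumptions made for this example; the thread does not fix any schema.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class BookRecord:
    """One catalogue entry; field names are illustrative, not an agreed schema."""
    title: str
    author: str = ""
    period: str = ""  # probable date or period of composition, kept as free text
    subjects: List[str] = field(default_factory=list)
    comments: str = ""


# Example entry with placeholder values only.
record = BookRecord(title="Example title", author="Example author",
                    subjects=["example subject"])
print(asdict(record))
```

Keeping the period as free text is a deliberate choice in this sketch, since many dates of composition are only approximately known.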

We should start it and aim to have at least 1 lakh entries in it. I feel that even a single person can easily compile 1000 entries in a week. But it would be better if this task were taken up by Rashtriya Sanskrit Sansthan (which has already compiled a list of Sanskrit theses) or some other Sanskrit institute.

-- anunaada Singh

विश्वासो वासुकिजः (Vishvas Vasuki)

Feb 27, 2017, 10:34:46 AM
to sanskrit-programmers
Well thought out. There are works such as http://www.worldcat.org/title/new-catalogus-catalogorum-an-alphabetical-register-of-sanskrit-and-allied-works-and-authors/oclc/10393298 and Aufrecht's Catalogus Catalogorum, which could serve as the basis for this.

--
You received this message because you are subscribed to the Google Groups "sanskrit-programmers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sanskrit-programmers+unsub...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
--
Vishvas /विश्वासः

dhaval patel

Feb 27, 2017, 11:03:22 AM
to sanskrit-p...@googlegroups.com
The best starting point would be the largest available database of this sort, i.e. NMM.


The data runs up to the end of 2015 or so.

This holds all the preliminary info you may need, properly arranged. No need to reinvent the wheel. Just decorate it.

Anunad Singh

Feb 28, 2017, 12:05:21 AM
to sanskrit-p...@googlegroups.com
Dhaval ji,
I find that the first two 'big files' at https://archive.org/details/NationalManuscriptMission  contain info only about who was in possession of the book and his location. The name of the book and the other details we need are not included.

- anunaada

--

Anunad Singh

Mar 1, 2017, 12:43:34 AM
to sanskrit-p...@googlegroups.com
The project can be started immediately with a 'seed' of about 2000 entries, mostly available online for copy and paste. For example, the following:


http://sanskritdocuments.org/doc_z_misc_misc/sanskritworksDev.html?lang=sa


विभिन्न विषयों के संस्कृत ग्रन्थ (Sanskrit texts on various subjects)

https://hi.wikibooks.org/wiki/विभिन्न_विषयों_के_संस्कृत_ग्रन्थ

The more laborious job will then follow, drawing on printed sources such as:

* Encyclopaedia of Indian Literature, By Amaresh Datta

https://books.google.co.in/books?id=zB4n3MVozbUC&printsec=frontcover#v=onepage&q&f=false

* A Companion to Sanskrit Literature: Spanning a Period of Over Three Thousand ...By Sures Chandra Banerji  

https://books.google.co.in/books?id=JkOAEdIsdUsC&printsec=frontcover#v=onepage&q&f=false


* GRETIL - Göttingen Register of Electronic Texts in Indian Languages

gretil.sub.uni-goettingen.de


* DLI resources and other resources


-- anunaada

========================================

Anunad Singh

Mar 1, 2017, 12:55:17 AM
to sanskrit-p...@googlegroups.com
I forgot to mention the greatest source of all:

* Census of the Exact Sciences in Sanskrit, By David Edwin Pingree, in 4 volumes


Volume 4

https://books.google.co.in/books?id=RQoNAAAAIAAJ&printsec=frontcover#v=onepage&q&f=false

dhaval patel

Mar 1, 2017, 2:30:14 AM
to sanskrit-p...@googlegroups.com
Hi Anunad ji,
There seems to be some misunderstanding.
There are tabs in the file.
The details from one of the tabs are copied here.
The columns run up to AN.

A sample is attached.
In my opinion, this is a really great database to start with.
I was intending to work on this, but didn't find the time.
The manuscripts are also classified by subject.



Fields are as below
Inst_code Record_no Title O_Title Author Commentary Commentator Scriber Lang Script Comp Sub Bund_no Manus_no Folio_no Pages Material Miss_portion Illus Cond cata_source remarks Rc_cd Ag_cd usercd date_entry manus_dt manus_era manus_size_l manus_size_w manus_size_h lang1 lang2 script1 script2 Subs1 Subs2 Vol_no Part_no Serial_no
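
For illustration, the column names listed above can be used to pull just the bibliographic fields out of an exported row. This is only a sketch: it assumes a tab of the spreadsheet has been exported to CSV with header names exactly as listed, and the helper name `pick_fields`, the choice of columns in `KEEP`, and the sample values are all hypothetical.

```python
import csv
import io

# Columns from the NMM field list that seem relevant to a book database.
# Which columns to keep is an assumption made for this example.
KEEP = ["Title", "Author", "Commentary", "Commentator", "Lang", "Script", "Sub", "remarks"]


def pick_fields(row, keep=KEEP):
    """Return only the wanted columns, with surrounding whitespace stripped."""
    return {name: (row.get(name) or "").strip() for name in keep}


# Tiny in-memory stand-in for an exported sheet (placeholder values only).
sample = io.StringIO(
    "Inst_code,Record_no,Title,Author,Lang,Script,Sub,remarks\n"
    "X1,1, Example title ,Example author,Sanskrit,Devanagari,Example subject,\n"
)
rows = [pick_fields(r) for r in csv.DictReader(sample)]
print(rows[0])
```

Columns missing from a given export simply come back as empty strings, so the same helper can be reused across tabs with slightly different headers.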

--
Dr. Dhaval Patel, I.A.S
Collector and District Magistrate, Anand
manus_example.xlsx

Anunad Singh

Mar 1, 2017, 9:38:23 AM
to sanskrit-p...@googlegroups.com
Sorry! I could not see the TABS. In fact, the document is quite elaborate.

-- anunaada


dhaval patel

Mar 1, 2017, 9:58:32 AM
to sanskrit-p...@googlegroups.com
Some approaches for finding method in the madness of this database would be:

1. Sort titlewise
2. Sort authorwise
Etc.

I also have SQL queries, if needed.
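
The two sort orders above could be expressed in SQL along the following lines. This is a sketch using Python's built-in `sqlite3` module; the table name `manuscripts` and the sample rows are invented for the example, and these are not the queries mentioned in the message.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE manuscripts (Title TEXT, Author TEXT, Lang TEXT)")
conn.executemany(
    "INSERT INTO manuscripts VALUES (?, ?, ?)",
    [
        ("Title B", "Author Y", "Sanskrit"),
        ("Title A", "Author Z", "Sanskrit"),
        ("Title A", "Author X", "Sanskrit"),
    ],
)

# 1. Sort title-wise (author as a tie-breaker).
by_title = conn.execute(
    "SELECT Title, Author FROM manuscripts ORDER BY Title, Author"
).fetchall()

# 2. Sort author-wise (title as a tie-breaker).
by_author = conn.execute(
    "SELECT Title, Author FROM manuscripts ORDER BY Author, Title"
).fetchall()

print(by_title)
print(by_author)
```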

Anunad Singh

Mar 2, 2017, 11:22:08 PM
to sanskrit-p...@googlegroups.com
There seem to be more than 1 lakh entries (manuscripts) in the five files. Of these, more than 50,000 may be Sanskrit manuscripts. There may be more than one manuscript for some of the books. This means that, with suitable filtering, we may get data on some 30,000 or so Sanskrit books.

We may apply the following sequence of filters:

1) Sort according to the 'language' of the books and pick the Sanskrit books.

2) Sort the data found in (1) according to book name. If the book name and author name are identical for two or more entries, only one is to be kept.

3) From the data found in (2), copy only columns such as book name, author name, period of composition, subjects dealt with, and important comments.
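
The three steps above can be sketched in plain Python as follows. The column names `Title`, `Author`, `Lang`, `Sub` and `remarks` are taken from the NMM field list quoted earlier in the thread; the function name `filter_books`, the exact-match test for "Sanskrit", and the sample rows are assumptions. No column is selected for the period of composition because the thread does not say which NMM field holds it.

```python
def filter_books(rows):
    """Keep Sanskrit rows, drop duplicate (title, author) pairs, trim columns."""
    # Step 1: keep rows whose language is Sanskrit (case-insensitive match).
    sanskrit = [r for r in rows if r.get("Lang", "").strip().lower() == "sanskrit"]

    # Step 2: sort by title, then keep one row per (title, author) pair.
    sanskrit.sort(key=lambda r: (r.get("Title", ""), r.get("Author", "")))
    seen = set()
    unique = []
    for r in sanskrit:
        key = (r.get("Title", "").strip(), r.get("Author", "").strip())
        if key not in seen:
            seen.add(key)
            unique.append(r)

    # Step 3: copy only the columns of interest.
    keep = ["Title", "Author", "Sub", "remarks"]
    return [{k: r.get(k, "") for k in keep} for r in unique]


# Placeholder rows standing in for manuscript records.
rows = [
    {"Title": "Title B", "Author": "Author Y", "Lang": "Sanskrit", "Sub": "s1"},
    {"Title": "Title A", "Author": "Author X", "Lang": "Sanskrit", "Sub": "s2"},
    {"Title": "Title A", "Author": "Author X", "Lang": "sanskrit", "Sub": "s2"},
    {"Title": "Title C", "Author": "Author Z", "Lang": "Hindi", "Sub": "s3"},
]
books = filter_books(rows)
print(len(books))
```

Exact matching on title and author will miss duplicates that differ only in spelling or script, so real data would likely need a normalization pass before step 2.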

--anunaada

--

dhaval patel

Mar 3, 2017, 8:30:01 AM
to sanskrit-p...@googlegroups.com


On 3 Mar 2017 09:52, "Anunad Singh" <anu...@gmail.com> wrote:
There seem to be more than 1 lakh entries (manuscripts) in the five files. Of these, more than 50,000 may be Sanskrit manuscripts. There may be more than one manuscript for some of the books. This means that, with suitable filtering, we may get data on some 30,000 or so Sanskrit books.

According to a written description that NMM provided to me, the data should cover at least the following:
Total catalogued mss - 35 lacs
Total digitized mss - 2 lacs

So metadata should be available for 35 lacs. 2.11 lacs also have pages or folios digitized.

And in my experience, Sanskrit mss may be about 75%.


We may apply the following sequence of filters:

1) Sort according to the 'language' of the books and pick the Sanskrit books.

2) Sort the data found in (1) according to book name. If the book name and author name are identical for two or more entries, only one is to be kept.

3) From the data found in (2), copy only columns such as book name, author name, period of composition, subjects dealt with, and important comments.

Seems reasonable. 
Who is volunteering to do this job?
The only addition would be transliteration normalization: some books have details in IAST, some in Devanagari, etc.
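
A first step toward such normalization is knowing which script each record uses. The sketch below only detects the script from Unicode character names; it does not transliterate, and an actual IAST/Devanagari conversion would need a dedicated transliteration tool that is not shown here. The function name `detect_script` and its return labels are assumptions for this example.

```python
import unicodedata


def detect_script(text):
    """Return 'devanagari', 'latin', 'mixed' or 'unknown' for a string."""
    text = unicodedata.normalize("NFC", text)
    scripts = set()
    for ch in text:
        # Unicode names begin with the script, e.g. "DEVANAGARI LETTER RA".
        name = unicodedata.name(ch, "")
        if name.startswith("DEVANAGARI"):
            scripts.add("devanagari")
        elif name.startswith("LATIN"):
            scripts.add("latin")
    if len(scripts) == 1:
        return scripts.pop()
    return "mixed" if scripts else "unknown"


print(detect_script("रामायण"))
print(detect_script("Rāmāyaṇa"))
```

Records could then be grouped by the detected label so that each group is converted to a single target scheme before sorting and deduplication.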



Anunad Singh

Mar 3, 2017, 9:12:00 AM
to sanskrit-p...@googlegroups.com
Do you mean that a machine-readable digital record of 35 lacs manuscripts is available? Where is it?
 

I can take on this responsibility, but I am quite busy these days. I may not find time to concentrate on this, and so it may take months!

-- anunaada

विश्वासो वासुकिजः (Vishvas Vasuki)

Mar 3, 2017, 12:10:40 PM
to sanskrit-programmers

2017-03-03 6:11 GMT-08:00 Anunad Singh <anu...@gmail.com>:
I can take on this responsibility, but I am quite busy these days. I may not find time to concentrate on this, and so it may take months!

Nice! Please consider using a GitHub repository and some simple format like CSV to publish the results. That way the fruits of your labour can benefit many others. And don't forget to include any scripts which might be of future use!
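
As a small illustration of the CSV suggestion, the sketch below writes book records with Python's standard `csv` module. The column names in `FIELDS` and the function name `write_books_csv` are hypothetical; the thread does not specify a file layout.

```python
import csv
import io

# Hypothetical column layout for the published file.
FIELDS = ["title", "author", "period", "subjects", "comments"]


def write_books_csv(books, out):
    """Write book records as CSV with a header row; `out` is a text file object."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    for book in books:
        # Missing fields are written as empty cells.
        writer.writerow({name: book.get(name, "") for name in FIELDS})


# Write to an in-memory buffer here; for a real file one would use
# open(path, "w", encoding="utf-8", newline="") so Devanagari text is preserved.
buffer = io.StringIO()
write_books_csv([{"title": "Example title", "author": "Example author"}], buffer)
print(buffer.getvalue())
```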

विश्वासो वासुकिजः (Vishvas Vasuki)

Mar 6, 2017, 10:47:53 PM
to sanskrit-programmers
Greetings! Please have a look at this as well: http://www.panditproject.org/

I have sent them the following request:
====================
namaste!

Could you provide a copy of your entire database? As part of our very popular open-source sanskrit-stardict project ( https://sites.google.com/site/sanskritcode/dictionaries ), I would like to make this available as stardict dictionaries (crediting you, of course), so that connoisseurs can search offline on a variety of devices.
=====================

Anunad Singh

Mar 7, 2017, 12:26:08 AM
to sanskrit-p...@googlegroups.com
Do they also have the digital versions of the 'texts' or only information (database) of the texts?

-- anunaada
