I thought the same but changed my mind.
- Digitizing all Sanskrit texts requires a large and stable group of proofers that can train and recruit new members, simplify our workflows, and suggest improvements. Scale is important due to network effects, and proofreaders are not fungible due to differences in motivation and interest. So for me, the more the better. (That is, I am not choosing between 100% of capacity on new texts and 100% on old; I am choosing between 100% capacity on new and an additional 100% capacity on old.)
- The nature of proofing is changing, especially as OCR and automated tools become better. For example, a proofer could scan in a PDF, ask $LLM to reconcile errors against a specific scraped version, review differences, and publish the text. This is not costly.
- Ambuda's proofed texts are tied to a specific printed source version, down to specific pages and (maybe) specific lines. Scraped texts don't do this and often lack source information, so they can't be used as-is.
- Duplication also uncovers new errors. For example, I found several severe errors in various shAnkara stotras online. This is a kind of double-keyed transcription, and it reduces errors for critical texts.
- I am willing to pay the cost of duplication at the margin if it means I can help people contribute to the vision in any capacity.
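
A rough sketch of the "review differences" step in the reconcile workflow above, assuming both the OCR output and the scraped version are already available as plain-text strings. The function name `review_differences` and the sample words are hypothetical, and Python's standard `difflib` stands in for whatever comparison tool a proofer actually uses; this is not a description of Ambuda's own tooling.

```python
import difflib


def review_differences(ocr_text: str, scraped_text: str) -> list[tuple[str, str, str]]:
    """Return (tag, ocr_span, scraped_span) for each word-level disagreement.

    Hypothetical helper: both inputs are assumed to be plain text in the
    same transliteration scheme.
    """
    ocr_words = ocr_text.split()
    scraped_words = scraped_text.split()
    # autojunk=False keeps frequently repeated words from being ignored.
    matcher = difflib.SequenceMatcher(a=ocr_words, b=scraped_words, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Only the spans where the two versions disagree need a human look.
            diffs.append((tag, " ".join(ocr_words[i1:i2]), " ".join(scraped_words[j1:j2])))
    return diffs


if __name__ == "__main__":
    ocr = "dharmakSetre kurukSetre samavetA yuyutsavaH"
    scraped = "dharmakSetre kuruksetre samavetA yuyutsavaH"
    for tag, left, right in review_differences(ocr, scraped):
        print(f"{tag}: OCR={left!r} scraped={right!r}")
```

The same comparison works for two independent transcriptions of one text, which is the double-keyed idea: the proofer only has to check the places where the two versions disagree against the printed page.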
Arun