Discuss: MIT vs GPL for Ambuda/Vidyut code?

Arun Prasad

Dec 29, 2023, 11:47:20 PM
to ambuda-discuss
Historically, I've opted for an MIT license for any programs I write that relate to Sanskrit. This is because anyone who might want to use that work commercially would still be promoting access to Sanskrit in some fashion, even if it's a proprietary one.

However, I'm increasingly uneasy about the implications of making this code available to modern AI systems. Roughly, my concern is that I might be enabling further concentration of power and contributing to tools that have no real accountability to human stakeholders.

I'm not proposing relicensing or anything similar at this stage. I'm simply raising this concern to the group for feedback to help decide what makes sense. Based on the discussion here, courses of action might be:

- Keeping things as-is.
- Licensing new code under a non-permissive license (e.g. GPL).
- Re-licensing old code.
- Potentially, moving off GitHub.

Arun

विश्वासो वासुकिजः (Vishvas Vasuki)

Dec 30, 2023, 12:32:15 AM
to Arun Prasad, ambuda-discuss
On Sat, 30 Dec 2023 at 10:17, Arun Prasad <aru...@gmail.com> wrote:
> Historically, I've opted for an MIT license for any programs I write that relate to Sanskrit. This is because anyone who might want to use that work commercially would still be promoting access to Sanskrit in some fashion, even if it's a proprietary one.
>
> However, I'm increasingly uneasy about the implications of making this code available to modern AI systems. Roughly, my concern is that I might be enabling further concentration of power and contributing to tools that have no real accountability to human stakeholders.

Explicitly exclude use in AI programs?


Vishvas /विश्वासः

विश्वासो वासुकिजः (Vishvas Vasuki)

Dec 30, 2023, 3:03:23 AM
to Arun Prasad, ambuda-discuss
It's very common for users to ask an LLM these days:

"How do I use library X to do Y?"

and get good answers.

So barring AI might hurt humans more than it helps them.

Shreevatsa R

Dec 30, 2023, 10:18:39 AM
to विश्वासो वासुकिजः (Vishvas Vasuki), Arun Prasad, ambuda-discuss
  • Right now, if you ask ChatGPT to explain a Sanskrit verse, it often does a decent job, and this is not based on any code, just patterns in the data (Sanskrit text). In a way, the "code" approach of Vidyut is the opposite of modern AI systems: deterministic vs stochastic. So I'm afraid the code may actually not be of much interest to AI systems. (I would actually hope, for the sake of accuracy, that something more deterministic/exact/correct someday becomes more a part of AI systems, but in the case of niche long-tail interests like Sanskrit it may be a while.)
  • If the code or its documentation does become part of the training data, so that users are pointed to this code/library for a Sanskrit need of theirs, this may actually be a good thing. (What Vishvas said.)
  • The licence may not matter much anyway, because currently it appears the major AI systems are all trained on basically the entire available internet, regardless of licensing or copyright status.
More broadly, I think "concentration of power and […] no real accountability to human stakeholders" are unfortunately just the way of the world/capitalism, and nearly all activity / participation in the world contributes to that in some way or the other. In the 1990s and early 2000s the GPL was a way to release code to the world without contributing to "capitalism" (software companies still sold software), but it already became less relevant once software started to be put behind a web UI, and AGPL, the answer to that, never took off in a similar way. I don't know what the answer will be in the age of large language models, but the difference between the MIT and GPL licences is unlikely to be it, IMO.

Pradyumna

Dec 31, 2023, 1:39:57 AM
to ambuda-discuss
A combination of re-licensing and moving off GitHub could help mitigate the problem, but it may not be worth it.
  1. Moving off GitHub will harm visibility, and unless everyone who uses the code declares how they will do so, you can never really be sure how it is being used.
  2. Changing the license might make more sense if you want to ensure more contributions by forcing derivative works to be free, but at present it will NOT stop people from using the code to train their models (like Shreevatsa said). Many people have already found GitHub Copilot to regurgitate GPL-licensed code (for example: https://twitter.com/DocSparse/status/1581461734665367554).
Maybe taking some action after this issue stops being a legal gray area might be better.

Ashwin Ramaswami

Dec 31, 2023, 5:50:50 PM
to ambuda-discuss
> More broadly, I think "concentration of power and […] no real accountability to human stakeholders" are unfortunately just the way of the world/capitalism, and nearly all activity / participation in the world contributes to that in some way or the other.

I agree. For example, making a website indexable by Google also contributes to concentration of power, but is something that we inevitably end up doing.
