codebase licensing advice for software configuration files

Robbie Morrison

Feb 9, 2023, 5:17:28 AM
to openmod list

Hello all

I am circulating some recent advice from an open‑source lawyer (who I cannot name due to the Chatham House rule) on the open licensing of software configuration files.  The term "work" here is legal jargon for an artifact potentially subject to copyright and allied rights protection.

"I have provided guidance to projects to explicitly not publish configuration files as part of the work but as separate works and to treat this separate works as data that is read into the covered work at runtime"

So that means that, for instance, a Python‑based Snakemake file should be released under, say, a Creative Commons CC‑BY‑4.0 content license and not under your main codebase software license (which may be anything from MIT to AGPL‑3.0‑or‑later). Such files may sit in the same repo, though some projects run distinct repos for their core code and for their associated workflow processing pipelines (for example, PyPSA and PyPSA‑Earth, respectively).
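
A minimal sketch of the "separate work read at runtime" arrangement, assuming a JSON configuration file. The file name `settings.json`, the keys `scenario` and `solver`, and the function names are hypothetical and only illustrate the split: the code is one work, and the file it reads is plain data that can carry its own license.

```python
import json
import tempfile
from pathlib import Path


def load_config(path: Path) -> dict:
    """Read a separately licensed configuration file as plain data at runtime."""
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


def run_model(config: dict) -> str:
    """Hypothetical entry point: the code only consumes values from the config."""
    return f"solving {config['scenario']} with {config['solver']}"


if __name__ == "__main__":
    # Demo: write a throwaway config file, then read it back as data.
    with tempfile.TemporaryDirectory() as tmp:
        config_path = Path(tmp) / "settings.json"
        config_path.write_text(
            json.dumps({"scenario": "baseline", "solver": "highs"}),
            encoding="utf-8",
        )
        print(run_model(load_config(config_path)))
```

The code never imports or embeds the configuration, so the two can be versioned and licensed independently.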

The Linux Foundation maintains a list of SPDX public license identifiers:

* https://spdx.org/licenses/

Only Open Source Initiative (OSI) approved software licenses should be applied to source code.

If anyone has experiences to share, please do so.

kia ora, Robbie

-- 
Robbie Morrison
Address: Schillerstrasse 85, 10627 Berlin, Germany
Phone: +49.30.612-87617

Robbie Morrison

Feb 21, 2023, 5:30:26 AM
to openmod-i...@googlegroups.com

Hello all

This discussion is quite technical and probably of limited interest.

First, thanks to those (KG, JH, TN) who replied to me directly and pointed out that snakemake would be classed as code. My experience ended with classic make, which simply articulates sets of rules and would likely be regarded as data.

Some more backstory. The discussion is directed to the AGPL‑3.0‑* license. That license often applies to projects developing code that could be run from remote servers and based on software‑as‑a‑service (SaaS) architectures. If you don't deploy that particular license, then you can probably stop reading.

The AGPL‑3.0‑* requires operators to convey their corresponding source, along with any modifications, on user request. That same license defines "installation information" as "any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work" and requires that this be provided too. So questions arise as to what to do when your configuration information contains security‑related and/or private details. The open source lawyers I know would naturally resist providing those details on request.

One option is to carve those details out into dedicated data files and treat them as a separate work, thereby avoiding the need to reveal them on request under the terms of the AGPL‑3.0‑*. You can then license them, or not, as you wish.
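
A rough sketch of how such a carve-out might look in practice, assuming two hypothetical JSON files: `deploy.json` holds shareable settings, and `secrets.json` holds the private details and is kept as its own work. The file names, keys, and function names are illustrative assumptions only.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def load_json(path: Path) -> dict:
    """Read one JSON file as plain data."""
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


def load_settings(public_path: Path, private_path: Optional[Path] = None) -> dict:
    """Merge shareable settings with an optional, separately held private file."""
    settings = load_json(public_path)
    if private_path is not None and private_path.exists():
        # Private values override or extend the public defaults.
        settings.update(load_json(private_path))
    return settings


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        public = Path(tmp) / "deploy.json"
        private = Path(tmp) / "secrets.json"
        public.write_text(json.dumps({"host": "example.org", "port": 8080}), encoding="utf-8")
        private.write_text(json.dumps({"api_key": "not-a-real-key"}), encoding="utf-8")
        print(sorted(load_settings(public, private)))
        print(sorted(load_settings(public)))
```

The code runs with or without the private file, so the published repository stays usable when that file is withheld.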

To my knowledge, just SIREN, open_plan, and (possibly) energyRt (from EDF, the Environmental Defense Fund) use an AGPL license. One other data pipeline project is considering moving to that license around the middle of this year.

sorry for any confusion, with best wishes, Robbie

--
You received this message because you are subscribed to the Google Groups "openmod initiative" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openmod-initiat...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/openmod-initiative/aafdaf63-2dcf-d221-2674-95f9277efd04%40posteo.de.

Joseph DeCarolis

Mar 7, 2023, 9:25:24 PM
to Robbie Morrison, openmod-i...@googlegroups.com
Hi Robbie,

Thanks for sharing this info, which raises a related question: is there best practice guidance on how to deal with a single GitHub repo that requires multiple license files? A single model repo containing source code, data, and configuration files may potentially require three different license files. I found this StackExchange thread (https://softwareengineering.stackexchange.com/questions/304874/declaring-multiple-licences-in-a-github-project), but I'm not sure there's a clear answer. This issue is particularly acute for legacy models that mixed file types -- not to mention code and data in the same file -- in ways that are now hard to separate.

Best,
Joe



--
Joseph F. DeCarolis
Professor
915 Partners Way
Department of Civil, Construction, and Environmental Engineering
North Carolina State University
Campus Box 7908
Raleigh, NC 27695-7908

Phone: 919-515-0480
Fax: 919-515-7908
E-mail: jdeca...@ncsu.edu
Web page: https://www.ccee.ncsu.edu/people/jfdecaro
Twitter: @jfdecarolis

Johannes Hampp

Mar 8, 2023, 3:29:43 AM
to Joseph DeCarolis, Robbie Morrison, openmod-i...@googlegroups.com
Hi Joe,

We do have mixed licences for our repos, e.g. PyPSA-EUR:

https://github.com/PyPSA/pypsa-eur/

* MIT: Code
* CC-BY-4.0: Data / doc
* CC0-1.0: Configuration files / auxiliary GitHub markup

The repo is REUSE compliant and may help you as an example.

I'm not sure if that's best practice. Maybe Robbie can comment on that?


Best,
Johannes



Robbie Morrison

Mar 17, 2023, 3:27:16 PM
to openmod-i...@googlegroups.com

Hi all

Johannes is right, the REUSE system provides a good strategy in most cases. I will use the legal term "work" to describe the various entities under various licenses and authorships.

Under REUSE, the relevant license texts are first added to a LICENSES/ directory in the repository root, and then each text‑based file gets a license notice which points to one of those license texts. That procedure then distinguishes each work implicitly. Those files too trivial to handle, like .gitignore, receive a Creative Commons CC0‑1.0 public domain waiver instead, something like:

# Waiver: To the extent possible under law, [Author] has waived all copyright and related or neighboring rights to this [configuration file]. http://creativecommons.org/publicdomain/zero/1.0/
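
In REUSE-compliant repositories the per-file notice is normally written as two SPDX comment tags, `SPDX-FileCopyrightText` and `SPDX-License-Identifier`, rather than as free text. The sketch below shows one way such a header could be prepended to a file's contents. The helper names and the "2023 Example Author" string are hypothetical, and the REUSE tool provides its own command for this job.

```python
def spdx_header(copyright_text: str, license_id: str, comment_prefix: str = "#") -> str:
    """Build the two SPDX comment lines that REUSE looks for in a file."""
    return (
        f"{comment_prefix} SPDX-FileCopyrightText: {copyright_text}\n"
        f"{comment_prefix} SPDX-License-Identifier: {license_id}\n"
    )


def add_header(body: str, copyright_text: str, license_id: str) -> str:
    """Prepend the header unless the text already declares a license."""
    if "SPDX-License-Identifier:" in body:
        return body
    return spdx_header(copyright_text, license_id) + "\n" + body


if __name__ == "__main__":
    # Demo on the contents of a trivial .gitignore-style file.
    print(add_header("*.pyc\n__pycache__/\n", "2023 Example Author", "CC0-1.0"))
```

The license identifier must match the name of a license text stored in the repository, for example `LICENSES/CC0-1.0.txt`.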

The corner case of two different works using the same license is not covered by REUSE (as best I can tell) except by resorting to the SPDX-FileCopyrightText field in each file. And even then, that might not be sufficient to provide disaggregation. It also seems rather brittle as a strategy.

I am currently following up this question elsewhere and can hopefully provide an update in due course.

with best wishes, Robbie

Robbie Morrison

Mar 20, 2023, 4:22:02 AM
to openmod-i...@googlegroups.com

Hi all

A follow‑up as promised. The more general issue is the normally poor state of legacy code, rather than the mechanics of licensing and license notices. One computer scientist opined thus (provided under the Chatham House rule so their name must necessarily remain hidden, also with some light copy‑editing):

The problem at hand is to perform a 'due diligence' on a large code base developed over decades to sort out, for each distinct component (or we can call it 'work' if you like), the chain of rights, with the goal to open source it.

This is a huge undertaking that I'm sure a few people in this [legal network community] have unfortunately faced before — a notable former example is moving Netscape Navigator to Mozilla Firefox. Inria in France, as an example from academia, has experience with similar beasts, and the one I have in mind took two years and special tooling to sort out.

I fear that using the different declared licenses in the source code files to infer where they belong may seem a nice idea, but in general it's far from sufficient, and it will be a rough ride to get to the end. 

Since the due diligence work will be expensive to do, one could suggest to take this occasion to check whether the result of the process may be to get a single 'work' under a single choice of license, to make it simpler to move forward once this dish of spaghetti is released, instead of finding tricky ways of keeping inside a single repository different 'works' and tracking their rights with some extra metadata here and there.

The REUSE project does not support two or more components (or works, to use the legal concept) under the same license type. I guess that project aims to tread some line between simplicity and functionality. REUSE does support code under, say, MIT or Apache‑2.0, documentation under, say, CC‑BY‑4.0, and configuration files under, say, CC0‑1.0, as Johannes indicated for the PyPSA project, but nothing more complex.

A corollary would be: when dealing with new or recent projects, plan your software architectures and open licensing strategies with considerable care and then maintain discipline!

with best wishes, Robbie


Johannes Hampp

Mar 20, 2023, 4:27:24 AM
to Robbie Morrison, openmod-i...@googlegroups.com
Hi all,

> maintain discipline!

Tip:
CI (Continuous Integration) is a very good helper to maintain
discipline: We use a REUSE check via pre-commit in our CIs on GitHub. If
a code change would violate REUSE compliance, our CI fails. You can also
choose a setup which prevents you from merging your code in that case.
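
To illustrate the kind of check involved, here is a much simplified stand-in written from scratch. It is not the REUSE tool itself, whose `reuse lint` command checks far more. It only looks for an `SPDX-License-Identifier:` tag near the top of each file or for a sidecar `<name>.license` file. The function names and the 2000-character cutoff are assumptions made for this sketch.

```python
import tempfile
from pathlib import Path
from typing import List

TAG = "SPDX-License-Identifier:"
SKIPPED_DIRS = {".git", "LICENSES"}


def has_license_info(path: Path) -> bool:
    """True if the file carries an SPDX tag or has a sidecar .license file."""
    sidecar = path.with_name(path.name + ".license")
    if sidecar.exists():
        return True
    try:
        head = path.read_text(encoding="utf-8", errors="ignore")[:2000]
    except OSError:
        return False
    return TAG in head


def missing_license_info(root: Path) -> List[Path]:
    """List files under root (relative paths) that lack licensing information."""
    missing = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix == ".license":
            continue
        rel = path.relative_to(root)
        if rel.parts[0] in SKIPPED_DIRS:
            continue
        if not has_license_info(path):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    # Demo on a throwaway directory: one tagged file, one untagged file.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "model.py").write_text(f"# {TAG} MIT\n", encoding="utf-8")
        (root / "config.yaml").write_text("solver: highs\n", encoding="utf-8")
        for rel in missing_license_info(root):
            print(f"missing license info: {rel}")
```

A CI job built on such a check would exit with a non-zero status whenever the returned list is non-empty, which is what makes the pipeline fail.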


Best,
Johannes


Taco...@sfu.ca

Mar 20, 2023, 4:29:17 PM
to openmod initiative
Hey Johannes,

The CI check on REUSE sounds really great - any tutorial/advice on how to set that up elsewhere?

Cheers,
Taco..

Joseph DeCarolis

Mar 20, 2023, 10:54:38 PM
to openmod-i...@googlegroups.com
Dear Johannes and Robbie,

Thanks very much for your replies. I was not aware of the REUSE system, but it provides exactly the sort of guidance I was hoping to get. And PyPSA-EUR provides a good example to follow.

As Robbie highlights, poorly organized legacy code / data will no doubt present challenging edge cases, but it seems like the best we can do is evaluate on a case-by-case basis and apply some common sense.

Best,
Joe




Johannes Hampp

Mar 21, 2023, 3:03:06 AM
to Taco...@sfu.ca, openmod initiative
Hi Taco,

Here's what we use:

* pre-commit [1] for checks, formatting, etc. pre-commit is quite
popular and is usually run locally before you commit new code in git.

* a .pre-commit-config.yaml [2] file in our repo

* The pre-commit.ci for CI, which runs the pre-commit configuration of a
public GitHub repository [3]

* The REUSE pre-commit hook [4]

So for setting it up with your own repo, you need the entry from [4]
like we have in [2]. Then register your repo with [3] and that's it.

The nice thing about it is that you can run everything locally from the
same configuration file, or let the CI take care of it remotely in your
public repository.

If you don't use GitHub there are a ton of tutorials on how to use
pre-commit via e.g. the GitLab CI or other CI providers. Configuration
stays the same, you only need to substitute [3] with the appropriate CI
then.

[1] https://pre-commit.com/
[2] https://github.com/PyPSA/pypsa-eur/blob/master/.pre-commit-config.yaml
[3] https://pre-commit.ci/
[4] https://github.com/fsfe/reuse-tool#run-as-pre-commit-hook



HTH,
Best,
Johannes