AIQC seeking contributors to bring deep learning to researchers


Layne Sadler

Mar 22, 2021, 6:42:53 AM
to NumFOCUS
AIQC is an open-source Python framework for rapid and reproducible deep learning. Its goal is to empower open science.

The framework weaves together many NumFOCUS/PyData tools (np, pd, sklearn, jupyter) with the deep learning ecosystem (keras, tf, torch) to provide out-of-the-box workflows that scientists can adopt.


Now that we have a stable API with a few data types and analyses, we would like to welcome contributors in hopes of building a community and joining NumFOCUS.



Thank you,
Layne

Jacob Barhak

Mar 22, 2021, 9:26:31 AM
to numf...@googlegroups.com
Hi Layne,

How can you accommodate reproducible deep learning when GPU interfaces are not reproducible? Here is a link to one discussion on the topic:

A quick search can find many others.

How do you plan to approach the hardware issues?

This is an important topic and I hope you have good answers.

Jacob



--
You received this message because you are subscribed to the Google Groups "NumFOCUS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to numfocus+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/numfocus/82a7654d-da2f-41e7-a2ee-9c2c0194e8bfn%40googlegroups.com.

Layne Sadler

Mar 22, 2021, 10:01:59 AM
to numf...@googlegroups.com
Hey Jacob,

Thank you for raising this. For starters, I'd recommend using a persistent format that preserves dtypes, and forcing the floating-point dtype when reading the data into memory as opposed to leaving it to system defaults (e.g. None, 'float', np.float32, np.float64). I don't think all systems support float64. Perhaps the worker nodes were using different versions of the same software (e.g. different dtypes supported in different numpy versions). Also, part of the problem may be in how NaN is handled as a float, so check for missing values.
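A minimal sketch of the dtype advice above. It assumes NumPy's `.npy` format as the persistent format (the thread does not name one) and uses a made-up two-row array; it is an illustration, not code from AIQC.

```python
import io

import numpy as np

# Hypothetical feature matrix with one missing value.
features = np.array([[0.1, 0.2], [np.nan, 0.4]], dtype=np.float32)

# The .npy format stores the dtype in its header, so it survives a round trip.
buffer = io.BytesIO()
np.save(buffer, features)
buffer.seek(0)
restored = np.load(buffer)

# Force the precision explicitly instead of relying on a system default.
restored = restored.astype(np.float32, copy=False)

# NaN is itself a float, so count missing values before training.
missing_count = int(np.isnan(restored).sum())
print(restored.dtype, missing_count)
```

An in-memory buffer stands in for a file on disk here; the same calls accept a file path.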

However, the focus of this library is more about recording and guiding the steps of the workflow than solving the challenges of deterministic computation on GPUs. Perhaps Dask or PyTorch distributed may hold an answer for you. Ultimately, I'd say that differences of a few decimal places will be smoothed out during training for data that has been properly encoded/scaled, and I wouldn't worry about it because the goal is to make a generalizable model.

Thanks,
Layne

You received this message because you are subscribed to a topic in the Google Groups "NumFOCUS" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/numfocus/2PoF-n2OT2Q/unsubscribe.
To unsubscribe from this group and all its topics, send an email to numfocus+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/numfocus/CAM_y%2B3Q7rg5uRv9U%2BFyPjmM77QFLpMkw5Ja4eB7wW-cLa_P2bw%40mail.gmail.com.

Jacob Barhak

Mar 22, 2021, 10:13:25 AM
to numf...@googlegroups.com
So Layne,

You are indicating that there are other reproducibility problems beyond the one I mentioned. Has anyone mapped all the problems that stand in the way of a reproducible machine learning model?

It seems there are multiple problems and solving one of the steps will not make things reproducible.

Yet I applaud selecting the topic. Many do not understand the importance and the difficulty. It is important to discuss it.

         Jacob





Layne Sadler

Mar 22, 2021, 10:51:05 AM
to numf...@googlegroups.com
Yeah, you just have to pick your spots. I wouldn't prioritize something like that unless it was changing the fundamental distribution of an important column or feature. It sounds like a good challenge for containerized workflows. Bear in mind that neural nets are non-deterministic.

Most solutions in the space are aimed at recording training and skip over data preprocessing entirely. The data simply shows up as "X_train, y_train" in memory. Here are some of the data problems we have solved so far: preventing data leakage during preprocessing, preparing validation splits/folds to avoid evaluation bias, and recording which sample indices go into which fold/split.
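To make the leakage point concrete, here is a rough sketch using scikit-learn's `StandardScaler` on made-up data. The 80/20 split and the variable names are assumptions for illustration; this is not how AIQC implements it internally.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(seed=0)
features = rng.normal(loc=5.0, scale=2.0, size=(100, 3))  # made-up data

# Record which sample indices belong to which split so it can be traced later.
shuffled = rng.permutation(len(features))
split_indices = {"train": shuffled[:80], "test": shuffled[80:]}

# Fit the scaler on the training rows only. Fitting on all rows would leak
# test-set statistics into preprocessing.
scaler = StandardScaler().fit(features[split_indices["train"]])
train_scaled = scaler.transform(features[split_indices["train"]])
test_scaled = scaler.transform(features[split_indices["test"]])
```

The test rows are transformed with the training mean and standard deviation, so their scaled mean is generally not exactly zero, which is the expected behaviour.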


Jacob Barhak

Mar 22, 2021, 12:11:26 PM
to numf...@googlegroups.com
Yes Layne, 

There is a problem with current libraries that skip some information necessary for reproducibility, such as splitting data into train and test without a trace back to the original data.

However, there is no reason a neural network cannot be deterministic. It is possible if implemented correctly. Let us distinguish between random and deterministic.
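The distinction between random and deterministic can be illustrated with a toy example. This is only a sketch in plain NumPy on the CPU, with a made-up one-layer model; it shows that seeded pseudo-randomness is repeatable, and it does not address the GPU-level issues raised earlier in the thread.

```python
import numpy as np


def train_tiny_model(seed: int) -> np.ndarray:
    """Train a one-layer logistic model on toy data; all randomness comes from `seed`."""
    rng = np.random.default_rng(seed)
    inputs = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
    targets = np.array([0.0, 1.0, 1.0, 1.0])  # logical OR
    weights = rng.normal(size=2)  # random initialisation
    bias = 0.0
    for _ in range(200):
        # Random sample order in every epoch, drawn from the seeded generator.
        for i in rng.permutation(len(inputs)):
            prediction = 1.0 / (1.0 + np.exp(-(inputs[i] @ weights + bias)))
            error = prediction - targets[i]
            weights -= 0.5 * error * inputs[i]
            bias -= 0.5 * error
    return np.append(weights, bias)


run_a = train_tiny_model(seed=42)
run_b = train_tiny_model(seed=42)

# Same seed, same machine, same library versions: identical parameters.
print(np.array_equal(run_a, run_b))
```

Training uses random initialisation and random ordering, yet both runs end with the same parameters because every random draw comes from the same seeded generator.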

Hopefully deterministic elements will be added into libraries in the future to allow reproducibility. 

        Jacob



Stefan van der Walt

Mar 22, 2021, 3:24:36 PM
to Layne Sadler, numf...@googlegroups.com
Hi Layne,

Welcome to the list!  I would be happy to share your project with some colleagues in the field.

I like that you provide an open mechanism for provenance tracking.  From what I've seen, the offerings available are either commercial or rather limited.

In terms of your mission, part of me hopes that we will teach AI practitioners *more* about what they're doing :)  But, if I read you correctly, I think your intent is to make tools that catch mistakes, validate results, and make it more transparent where problems are slipping in—all great.

Feel free to reach out if you have any questions on growing your project and its community.

Best regards,
Stéfan


Jacob Barhak

Mar 22, 2021, 3:59:00 PM
to numf...@googlegroups.com
Ok Layne,

This makes sense. You are focusing on part of the pipeline. Can you give an example of how your system is better than, for example, the split and fold in scikit-learn? I suspect I understand, yet since this is a general discussion list, you should do this for the benefit of all those who read it.

Also, what licenses did you consider when releasing your tool? This has implications in the long term. I am raising this topic for a reason: I only recently learned about the NumFocus approach and I wanted to understand your considerations. This is more about NumFocus than about your library - I am just using you as an example.

I will also send you an email privately about promotion of this library since it will be less interesting for the entire community.

Good luck.

               Jacob



Layne Sadler

Mar 22, 2021, 5:42:24 PM
to NumFOCUS
Yes, precisely. For example, last night I added a rule that prevents pairing "Algorithm.analysis_type=='classification_binary'" with "Labelcoder.sklearn_preprocess==OneHotEncoder()" because OneHotEncoder always outputs multiple columns, but sigmoid/binary cross-entropy wants a single output neuron. The rule then steers the user toward Binarizer() instead.
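A rule of this kind could look roughly like the sketch below. The function name, its signature, and the error text are hypothetical and are not taken from the AIQC source; only the pairing being rejected comes from the description above.

```python
def check_label_encoder(analysis_type: str, encoder_name: str) -> None:
    """Hypothetical rule check: raise if the encoder does not fit the analysis type."""
    if analysis_type == "classification_binary" and encoder_name == "OneHotEncoder":
        raise ValueError(
            "OneHotEncoder yields one column per class, but a sigmoid output "
            "with binary cross-entropy expects a single column. "
            "Consider Binarizer instead."
        )


# An acceptable pairing passes silently.
check_label_encoder("classification_binary", "Binarizer")

# The rejected pairing fails early, before any training starts.
try:
    check_label_encoder("classification_binary", "OneHotEncoder")
except ValueError as error:
    rule_message = str(error)
    print(rule_message)
```

Failing at configuration time, with a message that names the alternative, is what lets the framework guide a newcomer instead of surfacing a shape error deep inside training.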

However, is transfer learning a necessity for newcomers? I can't see a newly coding biologist gathering the courage to take on a wall of layers, and their 1-liners are much nicer for a UI. As long as it is done transparently (click to expand layers).

Layne Sadler

Mar 22, 2021, 6:13:25 PM
to NumFOCUS
Q: Which license, why?
A: AGPL, so that a cloud provider couldn't just fork it and make a cloud service of it without at least giving me a shoutout, and with no pressure to contribute back. If this doesn't jibe with NumFOCUS, then I am willing to change it to something less restrictive.

Q: How different from split/fold in sklearn?
A: The simple answer is that they are persistent. The longer answer is that the Splitset and Foldset objects are part of a larger schema of relationships and rules. I weave a lot of sklearn into this framework and plan to do more with imputing and feature importance.

AIQC actually uses both `train_test_split` (twice if `size_validation:float` is not None) and `StratifiedKFold` under the hood, but only for initially dividing the sample indices. From there I record the sample indices going into each split/fold. If you want to get real weird, the two can even be used together to give you 4 slices: folds_train_combined, fold_validation, validation, test.
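The two-call pattern described above can be sketched as follows. This is an approximation written for illustration, not AIQC's internal code: the label array is a stand-in, the sizes are converted to absolute counts to avoid rounding surprises, and `recorded_splits` is a hypothetical name.

```python
import numpy as np
from sklearn.model_selection import train_test_split

sample_count = 150
labels = np.repeat(["setosa", "versicolor", "virginica"], 50)  # stand-in labels
all_indices = np.arange(sample_count)

size_test, size_validation = 0.24, 0.12
test_count = round(size_test * sample_count)
validation_count = round(size_validation * sample_count)

# First call: carve the test split off the full set of indices.
remaining, test = train_test_split(
    all_indices, test_size=test_count, stratify=labels, random_state=0
)
# Second call: carve the validation split off what remains.
train, validation = train_test_split(
    remaining, test_size=validation_count, stratify=labels[remaining], random_state=0
)

# Persisting these lists is what makes each split traceable to the original rows.
recorded_splits = {
    "train": train.tolist(),
    "validation": validation.tolist(),
    "test": test.tolist(),
}
```

Splitting the indices rather than the data itself means the same recorded lists can later be used to fetch features, labels, or both.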

Like all AIQC objects stemming from a Dataset, you have granular control when it comes to fetching/ filtering the raw bytes by features/labels/both, columns, and sample indices. This enables encoding each split/fold separately to avoid data leakage from aggregate transformations. Furthermore, you can include/exclude specific dtypes and column names within those splits for encoding.

One thing I needed to solve for was that StratifiedKFold apparently doesn't handle stratification of continuous variables, so I do some manual quantiling of the data based on a user-defined `bin_count` parameter. 
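One way to do this kind of quantile binning is sketched below. It assumes `np.quantile` plus `np.digitize` for the bins and a made-up continuous label; the actual AIQC implementation may differ.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(seed=0)
continuous_label = rng.normal(size=90)  # made-up regression target
bin_count = 3
fold_count = 5

# Interior quantile edges, e.g. the 1/3 and 2/3 quantiles for three bins.
edges = np.quantile(continuous_label, np.linspace(0, 1, bin_count + 1)[1:-1])
# Each sample gets a bin number from 0 to bin_count - 1.
bin_ids = np.digitize(continuous_label, edges)

# Stratify on the bin numbers; the continuous values remain the training target.
folder = StratifiedKFold(n_splits=fold_count, shuffle=True, random_state=0)
placeholder = np.zeros((len(bin_ids), 1))  # split() only needs the sample count
fold_indices = [
    {"train": train.tolist(), "validation": validation.tolist()}
    for train, validation in folder.split(placeholder, bin_ids)
]
```

Each validation fold then draws from every quantile bin, so no fold is made up only of low or only of high label values.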

All the user has to worry about (one day in a UI):
```
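import aiqc
from sklearn.preprocessing import OneHotEncoder, PowerTransformer

# `df` is a DataFrame that was loaded earlier.
splitset = aiqc.Pipeline.Tabular.make(
    dataFrame_or_filePath = df
    , label_column = 'species'
    , features_excluded = None
    , size_test = 0.24
    , size_validation = 0.12
    , fold_count = 10
    , bin_count = 3
    # Newer scikit-learn (1.2+) renames `sparse` to `sparse_output`.
    , label_encoder = OneHotEncoder(sparse=False)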
    , feature_encoders = [{
        "sklearn_preprocess": PowerTransformer(method='box-cox', copy=False)
        , "dtypes": ['float64']
    }]
)
```

Jacob Barhak

Mar 22, 2021, 11:39:15 PM
to numf...@googlegroups.com
So Layne,

You explain the 

If you want to give your tool complete freedom you should really avoid GPL-type licenses - those are the worst. In fact, some people like me will not touch anything related to GPL these days after experiencing how problematic it is. Use of GPL-type licenses has been decreasing in the last few years. Their infectious copyleft properties resemble a virus more than something beneficial, and I know of complaints about these - typically problems occur after a long time, when it is hard to detach and the price is steep.

If you want to free your work to have a lot of impact - I would suggest Creative Commons Zero (CC0). Yet it does mean giving the work freedom to develop in ways you have no control over. This license is not copyright based - meaning it really frees the work. 

From what I understood from Katrina Riehl, the CC0 license is not something that NumFocus will support as an organization since it is not OSI approved. I just recently learned about it and this is why I asked. So this discussion represents my own opinion - and as far as I understand it, it is not NumFocus endorsed. I would really be interested to see a comment from NumFocus about the choice of license here and in general - this would be an interesting discussion on its own.

Hopefully you will consider switching the license.

Thanks for explaining differences from scikit-learn - it seems you stumbled on something important. 

             Jacob



Ralf Gommers

Mar 23, 2021, 3:55:44 AM
to numf...@googlegroups.com
On Tue, Mar 23, 2021 at 4:39 AM Jacob Barhak <jacob....@gmail.com> wrote:

> From what I understood from Katrina Riehl, CC0 license is not something that NumFocus will support as an organization since it is not OSI approved. I just recently learned about it and this is why I asked. So this discussion represents my own opinion - and as far as I understand it, it is not NumFocus endorsed. I would really be interested to see a comment from NumFocus about the choice of license here and in general - this would be an interesting discussion on its own.

The advice in https://choosealicense.com/ is good. Starting with "use the license preferred by the community". For Python almost all projects use BSD or MIT - for new projects you want to choose MIT. If you'd use R, then GPL is a lot more prevalent so that would be a good choice - although the tidyverse packages are all MIT.

That said, as the author it's your choice. NumFOCUS will naturally lean towards MIT since most of its projects use that and it's a great license for ensuring others can use your code, but other common OSI-approved licenses are fine too.

Cheers,
Ralf


Jacob Barhak

Mar 23, 2021, 5:05:21 AM
to numf...@googlegroups.com
Hi Ralf,

This is becoming a license-related discussion so I changed the topic name, and I have to comment.

OSI did not approve CC0 as a license and this is a problem. Even MIT and BSD, which are much better than GPL, are still copyright-based licenses. And copyright is a legal restriction tool.

There is a shift today in use of licenses and organizations like BioModels understood the difficulties and decided to move to CC0 which essentially puts things in the public domain. Check out this reference:

Rahuman S Malik-Sheriff, Mihai Glont, Tung V N Nguyen, Krishna Tiwari, Matthew G Roberts, Ashley Xavier, Manh T Vu, Jinghao Men, Matthieu Maire, Sarubini Kananathan, Emma L Fairbanks, Johannes P Meyer, Chinmay Arankalle, Thawfeek M Varusai, Vincent Knight-Schrijver, Lu Li, Corina Dueñas-Roca, Gaurhari Dass, Sarah M Keating, Young M Park, Nicola Buso, Nicolas Rodriguez, Michael Hucka, Henning Hermjakob, BioModels—15 years of sharing computational models in life science, Nucleic Acids Research, Volume 48, Issue D1, 08 January 2020, Pages D407–D415, https://doi.org/10.1093/nar/gkz1055

OSI position currently restricts things and this is part of the problem. The way you describe the state of affairs now  is a snapshot in time and if we want to improve things, some approaches to licensing should change to avoid some issues. If you want a longer discussion, please check this publication:

Jacob Barhak, Open Source and Sustainability, COMBINE 2020 October 5-9. Video: https://drive.google.com/drive/folders/1actGnx6FwvoCcPrrF3qbnO0AmHt10WN6 starting from minute 13:10. Presentation: https://jacob-barhak.github.io/COMBINE2020_OpenSource_upload_2020_10_04.odp

Hopefully this will give you a better perspective. If I understand correctly, NumFocus complies with OSI and therefore restricts things. If this is correct, NumFocus loses points. I suggest that this topic be raised for open discussion in NumFocus. Other licensing organizations support CC0 as a solution while OSI does not - I suggest that NumFocus adopt the less restrictive approach and allow CC0, or switch to endorsing another licensing organization that is more permissive.

And to Layne - you should consider your user base - today I practically won't touch a library with a GPL-related license if I have any alternative - if you ask around, you will find that I am not the only one - there are issues with this license that should not be overlooked. MIT or BSD are much better and less restrictive, yet I hope you will consider putting your work in the public domain - this is the best approach to make your work reusable by others.

I was amazed to learn NumFocus adopted the OSI approach.

NumFocus is doing a lot of good things, and should not be affected by restrictions of another organization.  Hopefully NumFocus will act properly on this topic.

              Jacob


Ralf Gommers

Mar 23, 2021, 5:20:40 AM
to numf...@googlegroups.com
On Tue, Mar 23, 2021 at 10:05 AM Jacob Barhak <jacob....@gmail.com> wrote:
> And to Layne - you should consider your user base - I practically won't touch today a library with GPL related license if I have any alternative - if you ask around, you will find that I am not the only one -

While I personally am happier with other licenses than GPL (because I cannot reuse code in projects I work on), there are also very valid reasons an author may want to use GPL. So telling an author first thing to change their license because "my pet reason X" is not a great response to someone sharing a new project to be honest.

> there are issues with this license that should not be overlooked. MIT or BSD are much better and less restrictive, Yet I hope you will consider putting work in the public domain - this is the best approach to make your work reusable by others.

No it's not. There are countries where you cannot waive copyright, as a quick glance at Wikipedia will tell you: https://en.wikipedia.org/wiki/Public-domain_software.


> I was amazed to learn NumFocus adopted the OSI approach.
>
> NumFocus is doing a lot of good things, and should not be affected by restrictions of another organization. Hopefully NumFocus will act properly on this topic.

I no longer have an official role within NumFOCUS so I won't speak for the organization, but all this doesn't make too much sense. Just like GPL fanboying is tiresome, so is an extreme stance on copyrights. Please just let people use the license they prefer, they all have pros and cons.

Ralf


Andy Ray Terrel

Mar 23, 2021, 7:08:35 AM
to numf...@googlegroups.com
The official NumFOCUS policy is that any non-OSI license must be approved by the board. There are cases where our community used non-OSI licenses.

We defer to the OSI as they are a group of trained lawyers who regularly litigate open source cases and following their advice will protect our contributors and users. If a person wants to use CC0, I recommend they become familiar with the legal reasons OSI recommends against it. Their mailing lists are very interesting for folks that like law, and you can find a link to their discussions on this topic at https://opensource.org/faq#cc-zero

As to the AGPL, I think Jacob's point is that while AGPL will stop cloud providers from ripping a person off, it will also deter users. This is a trade-off that every project maintainer has to make, and it is certainly true that most NumFOCUS projects would not use code that is licensed AGPL. I started my career very pro-GPL and eventually moved to the more permissive licenses and prefer the APLv2 these days, but I would also be more likely to personally include social awareness clauses in my licenses if I released more software (even violating OSI's policies). We all have our reasons for choosing a license and it turns out to be a very important choice that is very hard to change after a project becomes popular.

-- Andy

Jacob Barhak

Mar 23, 2021, 9:58:09 AM
to numf...@googlegroups.com
Thanks Ralf,

Which countries will not allow waiving copyright and release to the public domain? 

What I saw in your article is a link to the Berne Convention, which has to do with copyright - not public domain - and some countries did not sign it. So please be specific: I could not find anything in your link about public domain not being allowed - please correct me if I missed the text.

Also, when you release under CC0 you declare which country you release it under, and if you are ever contacted by someone outside it who needs a specific license, you can always release the software again for that entity - once it's in the public domain you can pretty much copy, modify and add a license that fits the needs of that person - in fact almost anyone can do it.

The link you sent about public domain software actually confirms my beliefs - look at the paragraph "Post-copyright public domain", which discusses what happened - this is evolution, and I suspect that in a few years things will change again to adapt to market needs - GPL is problematic since it is stagnant and restrictive. And I am talking from experience - I used to release under GPL - I avoid it now.

Notice that you too have a preference to other licenses - I suspect it is not accidental.

And I agree, a license is situational - sometimes you want to restrict sometimes you want to open - it is a decision every developer should consider. However, developers also need to be informed and this discussion is helping others understand the choices and their consequences.

My problem was with NumFocus position - and I will answer Andy in a different email. 

I think Ralf, we more or less think the same and this discussion was helpful.

              Jacob









Jacob Barhak

Mar 23, 2021, 10:32:57 AM
to numf...@googlegroups.com
Hi Andy,

The problem is the attachment to OSI dictation. I used to be a member, yet will not sponsor them anymore after I learned about their CC0 decision last year. In the past I used to communicate with people running for office there and they seemed reasonable and I had some interesting exchanges with them. However, it seems the organization changed. Perhaps I am wrong and I will be glad to be corrected. 

The discussion about CC0 in the OSI web page seems lacking since re-releasing public domain software under any other license resolves the issue they discuss. 

It is good that NumFocus has an escape clause to avoid OSI dictation. However, after this discussion started, I want to know NumFocus's position on public domain software and specifically CC0. Can you raise it in your forum?

If you prefer certain licenses when you consider funding for example, it creates a bias and a pull in a specific direction. If you open up opportunities and make things more free, it will eventually lead to better outcomes. NumFocus should not stay stagnant. Perhaps OSI dictation was good in the past, yet it may be time to move forward. 

I must add that the information on licenses is overwhelming and many times conflicting when reading articles on the web - it does seem to be a matter for lawyers. I agree that there are good reasons to choose a license and many times those are situational, so it is important that we are having this discussion.

Hopefully as a result of this discussion NumFocus will expand its views regarding licenses. 

                Jacob







Jérôme Kieffer

Mar 23, 2021, 11:44:28 AM
to numf...@googlegroups.com
On Tue, 23 Mar 2021 08:57:55 -0500
Jacob Barhak <jacob....@gmail.com> wrote:

> Which countries will not allow waiving copyright and release to the public
> domain?

Hi,

I believe France is one of them ...
There is a distinction between the author with (inalienable) rights and
the copyright owner ...
Public domain is obtained only 70 years after the death of the last
author.

Besides this, I changed the license of my Python code from GPL to MIT
and it was a pain to get the acknowledgment of every single contributor.
Picking the proper one in the first instance is advised.

Cheers,

Jerome

Sylvain Corlay

Mar 23, 2021, 12:15:43 PM
to numf...@googlegroups.com
I am not sure that NumFOCUS should be prescriptive against CC0, which is mostly used for textual content (like code snippets in documentation so as to remove as many constraints as possible), but very rarely software for which it does not really make sense.

Regarding the inalienable perpetual author rights in France (and most of Europe), these only concern "moral rights", as opposed to "patrimonial rights". The former concern the connection of authors to their work while the latter concern ownership. One of the main moral rights is that nobody can *claim* to be an (or the) author instead of the actual author(s), even if they hold the copyright, and actual authors are entitled to be associated with their work.

In short, these inalienable perpetual rights on a work do not prevent "licenses" authorizing redistribution etc.

Best,



Robert Kern

Mar 23, 2021, 2:07:13 PM
to numf...@googlegroups.com
On Tue, Mar 23, 2021 at 9:58 AM Jacob Barhak <jacob....@gmail.com> wrote:
> Thanks Ralf,
>
> Which countries will not allow waiving copyright and release to the public domain?

The United States is one of them. Before we adopted the Berne Convention, we had a statutory way to put stuff into the public domain: omit the copyright declaration, and it's in the public domain (more or less). When we adopted the Berne Convention, under which copyright adheres at the moment of creation regardless of whether it is marked with a copyright declaration or not, we did not also create a new statutory way to put works into the public domain. The US Code provides two ways for works to enter the public domain: expiration of copyright and works created by the federal US government. That's it. Nothing in US statutory law provides for private authors to voluntarily put creative works into the public domain before the expiration of copyright.
 
> What I saw in your article is a link to the Berne Convention that has to do with Copyright - not public domain, and some countries did not sign it. So please be specific I could not find in your link anything about public domain not being allowed - please correct me if I missed the text.

The point being made in the Wikipedia article about the Berne Convention isn't that public domain doesn't exist in any jurisdictions. Rather, just that its automatic copyright provisions reduced the ways that a work can enter into the public domain in many jurisdictions, just like the US. You had to do something extra above the Berne Convention to do it, and many didn't.

--
Robert Kern
Enthought

Jacob Barhak

Mar 24, 2021, 4:03:44 PM
to numf...@googlegroups.com
Thanks Jérôme, Thanks Sylvain, Thanks Robert,

All your answers just reflect how complicated the issue is. I assume none of you are lawyers trained in this area of law, and I have noticed that even lawyers argue about these issues, so even with legal training there would be many questions. Since I don't have legal training, I can only mention things I know, such as that CC0 was created in 2007-2009, long after the Berne Convention of 1886. So Robert, the creators of CC0 should have known about the convention at that point in time.

I am not sure about France's legal system, yet I know the USA has methods of putting things in the public domain - for example, many government documents are considered public domain - so CC0 works in the USA. In fact, it is an option one can choose when filling in a CC0 form that can be found here:


Interestingly enough, you can also find France there so I suggest the people from France check it out and if necessary complain to Creative Commons if a change is needed. 

And if someone does not like CC0, there are other licenses that place things in the public domain. I just picked one. 

My suspicion is that many people just do not want to let go of their work and make it public for everyone, and therefore there are many interpretations, opinions, and conflicts found on the subject. Some of those discrepancies serve those who want to keep some level of ownership and attachment to their work. Yet if someone truly wants to make things public and let go of ownership with respect to copyright, there should be a legal way to do it and organizations like NumFocus should respect this. Currently organizations like OSI seem to be in the way of this happening, and this was my main point.

Note that CC0 still respects other restrictions such as patent rights, yet copyright is much stronger than a patent in many aspects so CC0 is still much better than copyright based licenses with regards to freedoms. 

I appreciate those who wanted to share information - this is indeed an interesting discussion.

             Jacob



Andy Ray Terrel

Mar 24, 2021, 4:13:25 PM
to numf...@googlegroups.com
What makes you imply that NumFOCUS is not respecting people's right to choose the license for their work?

Jacob Barhak

Mar 24, 2021, 6:00:00 PM
to numf...@googlegroups.com
Well Andy,

It is the association of NumFocus with OSI and its decisions that is the problem. Note that you use those as a base and only discuss things as an exception to the rule. This way you channel people in a certain direction.

I stumbled across it by accident when I found out that one of NumFocus's projects, JOSS, enforces OSI guidelines in the review process. It is supposed to be a publication venue.

So if I have software I wish to publish and submit it to JOSS and they reject it because my license is not OSI approved, this seems problematic given what I know of OSI's approach. Since NumFocus funds this operation, this kind of implies NumFocus made a decision with regards to licenses that aligns with OSI.

I wrote an email to NumFocus members I know.  Katrina wrote back to me indicating that there is a connection to OSI licenses - although she was a little bit vague, this confirmed that there is influence. And please correct me if I got some of the details wrong - this is the best I understand from the information I encountered. I made sure to file issues on JOSS github. 

This discussion on the mailing list started partially because I was curious about exactly what is going on. 

And I admit, I do have a conflict of interest here since I own patents and look carefully at licenses these days.

Hopefully this discussion already had some positive effect and allows more freedom in future selection of licenses.

          Jacob

On Wed, Mar 24, 2021 at 3:13 PM Andy Ray Terrel <andy....@gmail.com> wrote:
What makes you imply that NumFOCUS is not respecting people's right to choose the license for their work?

Stefan van der Walt

unread,
Mar 24, 2021, 6:42:01 PM3/24/21
to numf...@googlegroups.com
Hi Jacob,

On Wed, Mar 24, 2021, at 14:59, Jacob Barhak wrote:
It is the association of NumFocus with OSI and its decisions that is the problem. Note that you use those as a base and only discuss things as an exception to the rule. This way you channel people in a certain direction.

I believe the intent behind NumFOCUS's guidelines is to guide towards open; OSI is a reasonable baseline, but we don't mean to discourage releasing works under more liberal terms.  In fact, I suspect many of us learned about licenses even before CC0 was officially sanctioned by the FSF to be compatible with the GPL.  I don't see any reason why we can't change the recommendation to "OSI-approved or CC0" now.

That said, simply requiring attribution, as per MIT, Apache, or BSD is an extremely low bar, and causes little difficulty in practice.  Data, on the other hand, benefits significantly more from CC0.

I stumbled across it by accident when I found out that one of NumFocus's projects, JOSS, enforces OSI guidelines in the review process. It is supposed to be a publication venue.

So if I have software I wish to publish and submit it to JOSS and they reject it because my license is not OSI approved, this seems problematic given OSI's approach. Since NumFocus funds this operation, this kind of implies NumFocus made a decision with regards to licenses that aligns with OSI.

NumFOCUS does not prescribe to their projects how to run their business, except in cases where they are legally obliged to do so, or where practical working arrangements are required.

And I admit, I do have a conflict of interest here since I own patents and look carefully at licenses these days.

Most of the projects here are released under MIT or BSD, which have nothing to say on patents.  Patented algorithms are headaches to deal with in open source: while we could technically distribute implementations of such algorithms, we would then have to warn our users about the implications of using them.  In scikit-image, we have made the decision never to ship patent encumbered algorithms.

Best regards,
Stéfan

Jacob Barhak

unread,
Mar 24, 2021, 9:57:04 PM3/24/21
to numf...@googlegroups.com
Thanks Stefan,

Your approach is reasonable - if NumFocus plans to be more liberal than OSI and allow CC0 and such licenses - this is perfect - no need to continue this discussion if you do this. 

I agree that MIT / BSD are good compromises for a copyright based open source license - much better than GPL - I noticed that many people in this discussion actually tended to drift towards those licenses. 

As for patents - this is a different legal restriction imposed for a limited time in a limited jurisdiction - and it is orthogonal to open source licenses.

If you use software that is used to infringe a patent - the license on it does not really matter. The patent is infringed regardless of the license. The headache is that open source software can be copied, and the infringement becomes broader without the knowledge of people who use it and think they abide by the license. This is the scary thing that causes difficulties. People just do not know.

So the community has to be educated about IP, including patents and also trade secrets. The Austin Python meetup last year hosted lawyers who code, and one of them, D.C. Toedt, discussed this issue:

D.C. wrote a free online book with a chapter that discusses IP - I am sending a link to the IP section

Hopefully this will help educate people on this list and beyond.

            Jacob

Andy Ray Terrel

unread,
Mar 25, 2021, 10:42:21 AM3/25/21
to numf...@googlegroups.com
While I'm not opposed to adding licenses beyond the list provided by the OSI, I'm not keen on CC0 being a recommended license.

NumFOCUS has a goal of creating software for scientific research. Scientific culture is built on citation and credit. While I applaud any folks who want to give all rights to their work for any use, I don't think CC0 for software sends the message that the work should be cited in source.

-- Andy

Jacob Barhak

unread,
Mar 25, 2021, 11:24:10 AM3/25/21
to numf...@googlegroups.com
Well Andy,

This is exactly the problem. You support holding onto some level of ownership - so there are strings attached - and if you want strings attached, you have to invest in it from your own resources.

CC0 does not mean that people will not cite or give credit - it just means you do not demand it, while other licenses have terms attached and, worse, copyright-based ownership makes it so the owner can change their mind and therefore have control over the work - this is the real trouble.

Note that after releasing CC0, you can still create modified versions and put copyright and licenses on those variations, yet everyone can do this - it is better than the other licenses you are used to where you have to ask permission from the owners. 

If you want ownership control - then I will ask why I should donate to NumFocus if eventually the money goes to create some sort of restriction on reuse. It is beyond just crediting a person - it is giving them control and funding it. This becomes questionable.

The message sent by CC0 is that if you want ownership of something, you better work for it and create something new and better than what was released - it becomes a starting point for competition, and competition generates better solutions. This is better than some of the open source licenses like GPL that lock the work and make it stagnant and have real issues with reuse of obsolete software and abandoned software. Releasing to the public domain gives incentive to others to move forward, while the older open source licenses may lead to stagnation and legal issues - and you saw some of the confusion even in this discussion.

So if NumFocus wants to support innovation, it may want to include public domain licenses in its portfolio and let people decide how they want their work to evolve in the future.

It will be interesting to see after many years how different tools evolved based on their license. The decline in GPL is evident after some years - I think the new public domain licenses open new opportunities and should be seriously considered for new projects - it will provide incentive for improvement.

              Jacob

Stefan van der Walt

unread,
Mar 25, 2021, 12:36:06 PM3/25/21
to numf...@googlegroups.com
On Thu, Mar 25, 2021, at 08:23, Jacob Barhak wrote:
This is exactly the problem. You support holding onto some level of ownership - so there are strings attached - and if you want strings attached, you have to invest in it from your own resources.

But, Jacob, you have to admit that the "strings" here are very thin. For BSD they are "if you use my work, mention it, and don't pretend that it's your own". That's closer to a decency clause than a license clause.

These strings are only prohibitive if you want to be a bad actor. And we don't need to enable bad actors at cost to our own projects.

I agree with you that citation is largely a separate matter, mostly one of culture that needs to be cultivated. 

Best regards, 
Stéfan 

Jacob Barhak

unread,
Mar 25, 2021, 1:15:04 PM3/25/21
to numf...@googlegroups.com
Well Stefan,

The point you miss is that there is still ownership the minute you put a copyright on it.

You are allowed to release the same work under different licenses once you put the copyright on.

And you put emphasis on bad actors - bad actors can also use the open source licenses to prohibit progress - I have personally experienced this.

And you will have to define a bad actor very carefully - you see, what you call a bad actor may just have different interests than yours.

If you release things under CC0 - your name is still there as the releaser - so people who reuse this work will always be pointed to the original and if their contribution is not significant, it will eventually be discovered. 

What I am suggesting is that if you want some exclusivity, you should work for it. If you want restrictions, apply for a patent or work on advertising or keep some functionality to yourself and do not release it - perhaps keep it a trade secret if possible. Those are all legal options you can take to restrict use - use those methods if you want to restrict - do not hide behind a copyright restriction and tell everyone you are open and permissive - copyright is a legal restriction mechanism.

And I agree that BSD / MIT are the least restrictive open source licenses - yet those are still copyright based and hence have ownership aspects.

Look, licensing is situational - it depends on what people want for their work. NumFocus, which transfers money for projects, should make products as public as possible. You can always give credit to contributors, yet do not give them ownership if you channel money to them with the intention of public use.

I hope I am clearer now.

          Jacob

Robert Kern

unread,
Mar 25, 2021, 3:24:54 PM3/25/21
to numf...@googlegroups.com
On Wed, Mar 24, 2021 at 4:03 PM Jacob Barhak <jacob....@gmail.com> wrote:
Thanks Jérôme, Thanks Sylvain, Thanks Robert,

All your answers just reflect how complicated the issue is - and I assume none of you are lawyers trained in this law, and I noticed that even lawyers argue about those issues, so even if you had legal training, there are many questions. Since I don't have legal training, I can only mention things I know, such as that CC0 was created in 2007-2009, after the Berne Convention of 1886. So Robert, the creators of CC0 should have known about the convention at that point in time.

Absolutely they did, which is why they designed the CC0 to include the fallback copyright license that takes effect in these jurisdictions where you cannot voluntarily place a copyrighted work into the public domain. CC uses a lot of cautionary phrases like "to the fullest extent allowed by law" and "no tool, not even CC0, can guarantee a complete relinquishment of all copyright and database rights in every jurisdiction" and "Please don’t take the 0 (zero) in the name “CC0” literally – no legal instrument can ever eliminate all copyright interests in a work in every jurisdiction."

 
I am not sure about France's legal system, yet I know the USA has methods of putting things in the public domain - for example many government documents are considered public domain - so CC0 works in the USA - in fact it is an option one can choose when filling in a CC0 form that can be found here:


To be clear, this is not a "method of putting things in the public domain". Works created by the federal government simply are in the public domain in the US. There's no choice involved. Using the CC0 on these works is just a standard way to describe the fact of it being in the public domain, and to relinquish as many rights as possible in other jurisdictions (the US federal government may still seek copyright protections in other nations; NASA has done this with its software sometimes; using the CC0 is a good way to disclaim those universally if they want it to function like public domain in other jurisdictions as well).

As a private individual in the US, I cannot put any of my creative works into the public domain in this way. I can certainly use the CC0 but the part that would be operative would be the fallback copyright license inside of it, not the public domain declaration. Applying the CC0 to my work does not actually put it into the public domain. Copyright still adheres; I've just given a very liberal license to the work.
 
Interestingly enough, you can also find France there so I suggest the people from France check it out and if necessary complain to Creative Commons if a change is needed. 

And if someone does not like CC0, there are other licenses that place things in the public domain. I just picked one. 

My suspicion is that many people just do not want to let go of their work and make it public for everyone, and therefore there are many interpretations, opinions, and conflicts on the subject.

When I was younger, I really liked the idea of putting my public work into public domain. You can still see remnants of that:


But my wanting it to work under the legal regime in which I live and work doesn't make it so. Nowadays, I use an MIT license for my public code works, not because I am a glory-hog that wants to see my name in the credit lines of everything that uses it, but because using short, simple, standard licenses makes their life easier. The CC0 license (admirably!) is trying to do a few separate things at once. It's also trying to do this for a number of jurisdictions that don't even share common concepts of the public domain and don't have any unifying set of agreements like the Berne Convention. That makes it large and sometimes complicated to figure out exactly what rights apply and don't. I came to recognize that I was not doing my users any favors by using the standardized CC0 for code. The MIT license can be shorter and on surer legal footing across the world because it can rely on more portable legal concepts. The burdens of terms of the MIT license are often lighter than the uncertainty imposed by the complicated nature of CC0.

No one is asking my opinion (and I'm certainly not speaking for NumFOCUS), but if someone did, I would recommend that those inclined to use the CC0 for code to use the MIT or BSD license instead. For graphics assets and datasets (especially datasets! Not copyrightable in the US but often are in the EU!), the CC0 is great; I endorse that use wholeheartedly. As for NumFOCUS's policies, I would encourage them to not reject CC0 projects, but I also think that not listing it in a list of preferred licenses is quite reasonable.

--
Robert Kern
Enthought

Stefan van der Walt

unread,
Mar 25, 2021, 3:46:12 PM3/25/21
to numf...@googlegroups.com
On Thu, Mar 25, 2021, at 12:24, Robert Kern wrote:
No one is asking my opinion (and I'm certainly not speaking for NumFOCUS), but if someone did, I would recommend that those inclined to use the CC0 for code to use the MIT or BSD license instead. For graphics assets and datasets (especially datasets! Not copyrightable in the US but often are in the EU!), the CC0 is great; I endorse that use wholeheartedly. As for NumFOCUS's policies, I would encourage them to not reject CC0 projects, but I also think that not listing it in a list of preferred licenses is quite reasonable.

FWIW, I am fully aligned with Robert's point of view.

Best regards,
Stéfan

Jacob Barhak

unread,
Mar 25, 2021, 5:12:41 PM3/25/21
to numf...@googlegroups.com
Well Robert, And Stefan,

Your arguments against CC0 need to get context - and I am not a lawyer, yet let me try.

The cautionary language you see actually represents reality. The reality is that different jurisdictions have different laws - even copyright is different in different locations. So CC0 takes this into account and actually explains it - this is not true about BSD / MIT. Since CC0 is more explanatory - it is superior.

Your argument about universality suggests copyright is not waived - which is odd, because CC0 claims to do exactly that. And recall that after you waived copyright, anyone can create a new version - with regards to freedom, it is better than the ownership model of copyright, which requires permission from the owner. If I extend your argument to the claim that CC0 is either impossible or even illegal, I would not buy that extended argument. So unless you are a lawyer who deals with those matters, I suggest we stay at the level of our best understanding.

As for your argument about simplicity: the CC0 text is even simpler than BSD / MIT. Here is the text you add to the work you publish to make it hold:

"To the extent possible under law, ??author?? has waived all copyright and related or neighboring rights to ??name of work?? This work is published from: ??location??"

Much shorter and more explainable than BSD / MIT. So your argument there does not really hold. In fact, I can counter-argue that BSD/MIT misleads people into thinking that they are allowed to do things not allowed by law, and therefore those are more dangerous.

So again, those licenses are situational - people who want to publish a certain way should not be prohibited from doing so by a policy in NumFocus / OSI or another entity, unless you have a specific agenda, in which case you should state your agenda clearly.

One nice thing about this argument is that it seems there is no one recommending GPL anymore - this is a step in the right direction. Again, I think MIT/BSD are nice compromises, yet it is better that developers be educated on licenses and this is perhaps the best thing about this discussion.

Thank you for contributing.

          Jacob

Robert Kern

unread,
Mar 25, 2021, 5:38:46 PM3/25/21
to numf...@googlegroups.com
On Thu, Mar 25, 2021 at 5:12 PM Jacob Barhak <jacob....@gmail.com> wrote:
Well Robert, And Stefan,

Your arguments against CC0 need to get context - and I am not a lawyer, yet let me try.

The cautionary language you see actually represents reality. The reality is that different jurisdictions have different laws - even copyright is different in different locations. So CC0 takes this into account and actually explains it - this is not true about BSD / MIT. Since CC0 is more explanatory - it is superior.

Your argument about universality suggests copyright is not waived - which is odd, because CC0 claims to do exactly that. And recall that after you waived copyright, anyone can create a new version - with regards to freedom, it is better than the ownership model of copyright, which requires permission from the owner. If I extend your argument to the claim that CC0 is either impossible or even illegal,

Please do not put words in my mouth. I have never said such a thing.

To be clear, I really like the CC0! If you really want to release your work into something that closely resembles our ideal notions about the public domain, the CC0 is far and away the best instrument to do so. I wholeheartedly recommend that people use the CC0 instead of crafting your own public domain dedication like I once did. And for some works, especially datasets, it's really the most appropriate thing to use.

But we should be clear about what it does and does not do. In many jurisdictions, like my own USA, it does not actually put the work into the public domain. It gives a very liberal license that, in its own words, "To the greatest extent permitted by, but not in contravention of, applicable law", closely replicates the conditions of works actually in the public domain through copyright expiration.
 
I would not buy that extended argument. So unless you are a lawyer who deals with those matters, I suggest we stay at the level of our best understanding.

As for your argument about simplicity: the CC0 text is even simpler than BSD / MIT. Here is the text you add to the work you publish to make it hold:

"To the extent possible under law, ??author?? has waived all copyright and related or neighboring rights to ??name of work?? This work is published from: ??location??"

Much shorter and more explainable than BSD / MIT. So your argument there does not really hold. In fact, I can counter-argue that BSD/MIT misleads people into thinking that they are allowed to do things not allowed by law, and therefore those are more dangerous.

No, that's just the text you can place on your code to indicate that you are releasing it under the CC0. That's just a notification, not the actual license. This is the CC0 text:

 
That's what your user has to read and understand (in addition to local law) to understand what they can and cannot do with your work. You'll notice that it has a lot of conditionals and references to "applicable law". You'll also notice that it explicitly talks about the remaining Copyright that the Affirmer may still retain due to the applicable law regardless of the attempt to waive and abandon it by applying the CC0.

So again, those licenses are situational - people who want to publish a certain way should not be prohibited from doing so by a policy in NumFocus / OSI or another entity, unless you have a specific agenda, in which case you should state your agenda clearly.

AFAICT, NumFOCUS and OSI have been pretty clear on their agendas with respect to their non-endorsements (not prohibitions) of CC0. They are not the agendas that you are projecting onto them.
 
--
Robert Kern
Enthought

Jacob Barhak

unread,
Mar 25, 2021, 5:53:33 PM3/25/21
to numf...@googlegroups.com
Thanks Robert,

You clarify things and I apologize for extending your argument to an extreme limit - yet this is important to explore those issues in the discussion.

Our views are not that different and it seems we are both playing devil's advocates to make some points clearer for the readers. 

My claim about CC0 describing the legal landscape stands - this is as public domain as you can get and it is a waiver of copyright.

As for OSI - non-endorsement of CC0 - this means rejection if you use the term "OSI approved" - which is what people use many times - this is part of my complaint. Too many nice words channel people in a certain direction, which becomes a problem, especially if this is connected to channelling of funds.

In this discussion we explored enough so that NumFocus will officially consider CC0 and similar licenses as approved licenses for its funded projects. 

I expect a public declaration on this - currently Stefan's wording seems a bit obscure - somewhat similar to OSI non endorsement.  You either say Yes or No to CC0. 

I hope I am clearer with my intentions for calling for action now. 

              Jacob

Robert Kern

unread,
Mar 25, 2021, 7:33:59 PM3/25/21
to numf...@googlegroups.com
On Thu, Mar 25, 2021 at 5:53 PM Jacob Barhak <jacob....@gmail.com> wrote:
Thanks Robert,

You clarify things and I apologize for extending your argument to an extreme limit - yet this is important to explore those issues in the discussion.

Our views are not that different and it seems we are both playing devil's advocates to make some points clearer for the readers. 

I am not. I have been entirely sincere. I choose my words very carefully. Again, please do not speak for my intentions. You may wish to consider how much this kind of projection has been the source of manufactured disagreement in this thread.

My claim about CC0 describing the legal landscape stands - this is as public domain as you can get and it is a waiver of copyright.

This is roughly correct, but the uncertainty about how much copyright actually gets waived by using the CC0 is a key factor that I think your position is underappreciating.

As for OSI - non-endorsement of CC0 - this means rejection if you use the term "OSI approved" - which is what people use many times - this is part of my complaint. Too many nice words channel people in a certain direction, which becomes a problem, especially if this is connected to channelling of funds.

NumFOCUS has a mission, so it definitely will channel projects it supports in particular directions as they interpret their mission. You are free to want them to channel projects differently (and you have advocated for such channeling here in various formulations). But channel they will.
 
In this discussion we explored enough so that NumFocus will officially consider CC0 and similar licenses as approved licenses for its funded projects. 

Well, that's for NumFOCUS folks to decide, I think. I doubt that our explorations here have been especially informative on this point.
 
I expect a public declaration on this - currently Stefan's wording seems a bit obscure - somewhat similar to OSI non endorsement.  You either say Yes or No to CC0. 

The status quo (Andy: "The official NumFOCUS policy is that any non OSI license must be approved by the board") seems very clear and workable to me, even though it is not the binary that you are demanding.

--
Robert Kern
Enthought

Jacob Barhak

unread,
Mar 25, 2021, 8:39:47 PM3/25/21
to numf...@googlegroups.com
So Robert,

Your choice of words implies a position. By now it is pretty well understood - to me it seems you prefer the ownership model governed by copyright, since you cast doubt on the term public domain and do not cast doubt on the simplified view of reality projected by some open source licenses. The doubt casting is what I am zeroing in on.

If NumFocus will not change position on the licensing topic, then this discussion should end here and pass to other forums, since we explored this topic enough to make a decision on the matter.

If NumFocus's mission implies certain choices of licenses partially dictated by OSI - people should know about this - the channeling of funds a certain way with certain considerations has implications that people will comprehend once things are explained.

Hopefully this discussion provides a starting point to better comprehend what is going on. I am not sure we can add much more at this point - I think it is time for action and hopefully it will be taken.

               Jacob

Robert Kern

unread,
Mar 25, 2021, 10:24:04 PM3/25/21
to numf...@googlegroups.com
On Thu, Mar 25, 2021 at 8:39 PM Jacob Barhak <jacob....@gmail.com> wrote:
So Robert,

Your choice of words implies a position. By now it is pretty well understood - to me it seems you prefer the ownership model governed by copyright, since you cast doubt on the term public domain and do not cast doubt on the simplified view of reality projected by some open source licenses. The doubt casting is what I am zeroing in on.

I repeat: "do not speak for my intentions. You may wish to consider how much this kind of projection has been the source of manufactured disagreement in this thread." Your reduction of others' views to strawmen is not doing your argument a service.
 
If NumFocus will not change position on the licensing topic, then this discussion should end here and pass to other forums, since we explored this topic enough to make a decision on the matter.

If NumFocus's mission implies certain choices of licenses partially dictated by OSI - people should know about this - the channeling of funds a certain way with certain considerations has implications that people will comprehend once things are explained.

It's on their website (scroll down to "Be open."): https://numfocus.org/projects-overview

Hopefully this discussion provides a starting point to better comprehend what is going on. I am not sure we can add much more at this point - I think it is time for action and hopefully it will be taken.

But yes, I agree that this discussion is not helping NumFOCUS make any decisions, and so it should end. I recommend being prepared for the action to be "maintain the status quo".

--
Robert Kern
Enthought

Jacob Barhak

unread,
Mar 26, 2021, 7:58:02 AM3/26/21
to numf...@googlegroups.com
So Robert,

Your intentions are interpreted from your responses. You seek flaws in some of my arguments and I show flaws in yours - this is part of the debate to clarify the issue. And indeed this debate has exposed many details.

And since you added new information and pointed to the NumFocus mission in your link - allow me to highlight two elements:
1. Ownership - it seems NumFocus assigns ownership to the project and not to individual contributors - this implies a copyright-based license where the project gets the copyright - correct me if I am wrong here. However, at least one project has a BDFL, which implies one person has control of the project for life.
2. Under "Be open" - NumFocus uses the term OSI approved license - which translates to no CC0 - so this discussion is highly relevant.

Although you highlighted licenses other than CC0, you mentioned you "liked" CC0 for some use cases - what I am asking is that it be added to the definition of open - you cannot be more open than waiving copyright and giving permission to the public.

Yes, this implies opening things up to competition and reducing the impact of ownership, yet in the long run this may lead to a much better outcome. It will also deal with issues such as abandoned or obsolete projects that may have useful code without restriction. NumFocus should think farther than the immediate future and adapt its policies to the changing environment.

               Jacob

Ralf Gommers

unread,
Mar 26, 2021, 10:07:15 AM3/26/21
to numf...@googlegroups.com
On Fri, Mar 26, 2021 at 12:58 PM Jacob Barhak <jacob....@gmail.com> wrote:
So Robert,

Your intentions are interpreted from your responses. You seek flaws in some of my arguments and I show flaws in yours - this is part of the debate to clarify the issue. And indeed this debate has exposed many details.

And since you added new information and pointed to the NumFocus mission in your link - allow me to highlight two elements:
1. Ownership - it seems NumFocus assigns ownership to the project and not to individual contributors - this implies a copyright-based license where the project gets the copyright - correct me if I am wrong here.

Yes, you are wrong. This is absolutely not how things work. The author of code has copyright. And NumFOCUS doesn't own anything, so cannot "assign" anything either.

However, at least one project has a BDFL, which implies one person has control of the project for life. 
2. Under "Be open" - NumFocus uses the term OSI approved license - which translates to no CC0 - so this discussion is highly relevant.

Although you highlighted licenses other than CC0, you mentioned you "liked" CC0 for some use cases - what I am asking is that it be added to the definition of open - you cannot be more open than waiving copyright and giving permission to the public.

Yes, this implies opening things up to competition and reducing the impact of ownership, yet in the long run this may lead to a much better outcome. It will also deal with issues such as abandoned or obsolete projects that may have useful code, by leaving that code without restriction. NumFocus should think farther than the immediate future and adapt its policies to the changing environment.

This is very much uninteresting, and you have a very poor grasp of all of this. You hijacked someone's announcement of a new project, ascribe bad intentions to NumFOCUS, speak for someone else's intentions, and admit taking a "devil's advocate" position (which in general is an awful way of arguing - and this list is not for arguing to begin with).

At this point you're just wasting the time of people reading this thread, and are annoying a number of people who have patiently explained both NumFOCUS's position and copyright law.

I second Robert's recommendation: "this discussion is not helping NumFOCUS make any decisions, and so it should end". Please stop.

Ralf

Andy Ray Terrel

unread,
Mar 26, 2021, 10:22:36 AM3/26/21
to numf...@googlegroups.com
Jacob,

Let's take this off list. I'm happy to answer any questions but perhaps a video conference would allow us to communicate better.

-- Andy

Layne Sadler

unread,
Mar 30, 2021, 9:37:31 AM3/30/21
to NumFOCUS
Thanks guys. After talking to a few people and looking at other projects I ended up running with the BSD license. Explained my reasoning on a new Community page. Any eyes/ tips on this page would be appreciated. 

Layne

Jacob Barhak

unread,
Mar 30, 2021, 3:33:17 PM3/30/21
to numf...@googlegroups.com
Great Layne,

BSD is much better than GPL variants. Yet always remember - if you are the copyright holder - or if all contributors agree - you can switch a license to adapt with time - things change.

I talked to Andy on Friday, and after the discussion he mentioned he would raise the topic of expanding beyond OSI approved licenses to other licenses - he mentioned CC licenses.

He promised to report to this mailing list, yet I guess he has not found the time to do it yet - perhaps we will get a report later.

After that talk I also sent him an email asking him to look at NumFocus regulations in case of disputes / abandonment of the code in a NumFocus project. Hopefully they will discuss this at the board as well. Those things happen and NumFocus had better be prepared.

Anyway, Layne, you should read the 4th clause of BSD again - perhaps there is a misunderstanding. The 4th clause of the 4-clause BSD license is non-endorsement - so as not to be affiliated with misuse. I am unclear why you mention giving you credit - the 4th clause is about no affiliation in case of trouble. I may have misunderstood something - clarification will help.

Meanwhile I am glad this discussion actually made a change - good luck.

           Jacob

 


Stephan Hoyer

unread,
Mar 30, 2021, 5:29:07 PM3/30/21
to numf...@googlegroups.com
For whatever the merits, I think it's worth noting that at least some external organizations view CC0 as problematic. For example, this is the case at Google (where I work), which uses and supports lots of open source software and probably has as much legal expertise around OSS licensing as anyone else.

For example:
- We can't contribute patches to CC0 projects: https://opensource.google/docs/patching/#forbidden
- Google Summer of Code only supports releasing code with an OSI approved license: https://summerofcode.withgoogle.com/rules/

My two cents is that it would not be well-advised for NumFOCUS projects to exclude contributions and support from organizations like Google. But obviously this is a matter of individual choice.

Jarrod Millman

unread,
Mar 30, 2021, 5:39:48 PM3/30/21
to numf...@googlegroups.com
Hi Layne,

It is unusual to use the old 4-clause BSD license. I would recommend
using the much more common 3-clause (or 2-clause) BSD license. The
3-clause one is the most popular for open source scientific Python
projects. I personally avoid using software with a 4-clause BSD
license and I suspect that others would as well.

Best regards,
Jarrod

Jacob Barhak

unread,
Mar 31, 2021, 11:41:58 AM3/31/21
to numf...@googlegroups.com
Well Stephan,

You surprise me.

I assume an organization like Google will review any code being used, not only CC0-related code - please correct me if I am wrong. You may also want to check within Google with the entity that deals with intellectual property - you see, Google does own a few patents, and some licenses address those specifically. CC0 does that in a specific manner, other licenses do it another way, and some ignore patents altogether. All this has to do with intellectual property Google is associated with and is internal to Google - another entity may have other regulations. A review process is worthwhile regardless of the license you use - I assume this happens anyway. So I see the links you sent me as specific items in a much larger checklist that is internal to Google.

As for the specific links you sent - you may still be able to contribute to CC0 projects through IARC - https://opensource.google/docs/iarc/
It just means you need to review what is going on - which is good practice anyway - I suggest everyone do this for any code they include.

And Google Summer of Code should consider including CC0 as a license they use as well and extend the variety of licenses - why restrict to what OSI approves? 

More importantly, what Andy and I discussed was extending the options for licenses, not excluding licenses. So if I understand your intentions, there is nothing to fear from NumFocus extending their definition - the definition of "open" will be extended, not reduced. And you can still contribute through Google using the IARC program if approved - they claim to historically approve the "vast majority" of cases anyway.

I think adding variety is a good solution allowing people to choose.

             Jacob


Andy Ray Terrel

unread,
Apr 14, 2021, 1:21:17 PM4/14/21
to numf...@googlegroups.com
Hi all,

I took Jacob's comments and yours to the board meeting last week. Since so many companies do not recognize CC0 as an acceptable default license, the NumFOCUS board does not wish to make it one of the defaults for our fiscal sponsorship. Thus the current policy stands. Any project may argue that CC0 is the right license for its community when applying for fiscal sponsorship, but it will require board approval to be accepted.

Thank you for all the discussion.

-- Andy

-- 

Andy R. Terrel, PhD

President, NumFOCUS Board

Jacob Barhak

unread,
Apr 14, 2021, 3:40:31 PM4/14/21
to numf...@googlegroups.com
So Andy,

Please be very specific. You really should be. I want to know exactly what the arguments against were, with numbers and names of companies and, if possible, with transcripts per board member.

The Google example brought up on this email list is manageable, as I pointed out to the author. Please provide additional details on more companies that will not recognize CC0 to back your decision - it might be just like the Google situation where people do not read the fine print.

Also, it is important to know what companies NumFocus aligns itself with - because there are many opinions. If you use a narrow definition of open, it is important to know what you really mean and whom NumFocus serves.

Please note that there is a growing number of entities that use only CC0, so you are caught in a decision of banning one or the other.

In our discussion you seemed open to opening the definition of open so developers can choose. By not allowing new open licenses you chose to reduce the options given to developers to choose from, so you actually narrow down the options.

You also probably did not address issues such as developer conflict or software abandonment - if you consider those, you will see that CC0 is far superior, as it resolves them implicitly.

I await full details on your decision and plan to continue pushing this issue. How about an open debate on the topic? Hopefully you will be open to this.

                Jacob




Stephan Hoyer

unread,
Apr 14, 2021, 4:29:33 PM4/14/21
to numf...@googlegroups.com
On Wed, Apr 14, 2021 at 12:40 PM Jacob Barhak <jacob....@gmail.com> wrote:
The Google example brought up on this email list is manageable, as I pointed out to the author. Please provide additional details on more companies that will not recognize CC0 to back your decision - it might be just like the Google situation where people do not read the fine print.

As somebody who has to follow Google's policies, from my perspective they are quite restrictive about how we can use and contribute to CC0 code. Some use-cases are entirely prohibited, and most others require specific legal approval. In contrast, using and contributing to projects with OSI licenses as part of my work or on my own time is in most cases pre-approved. In my opinion, "difficult" would be a better summary than "manageable" for working with CC0 code at Google.

Andy -- thanks for bringing this to the NumFOCUS board. In my opinion, this is a reasonable resolution, and it probably would not be productive to continue this debate further at this time.

Stefan van der Walt

unread,
Apr 14, 2021, 4:35:37 PM4/14/21
to Jacob Barhak, numf...@googlegroups.com
Hi Jacob,

On Wed, Apr 14, 2021, at 12:40, Jacob Barhak wrote:
In our discussion you seemed open to opening the definition of open so developers can choose. By not allowing new open licenses you chose to reduce the options given to developers to choose from, so you actually narrow down the options.

A correction: Andy said the board would have to approve CC0 for any given application, not that it would not be allowed.  One reason for this is the inherent complexity of CC0 (as indicated by this thread, and the fact that not even OSI could reach a firm conclusion on it).

Anyone who wants to use CC0 should apply through the standard process, and we'll take the conversation from there—once it happens.

Best regards,
Stéfan

Jacob Barhak

unread,
Apr 14, 2021, 5:37:15 PM4/14/21
to Stefan van der Walt, numf...@googlegroups.com
Well Stefan,

When Andy and I spoke, he mentioned he would raise the entire CC corpus of licenses as an alternative to be offered - CC0 being one of them.

It seems the argument focuses on CC0 and I understand why - it really opens up the definition of open and allows doing extraordinary things that OSI did not approve of.

OSI could not reach a decision on CC0, yet other open source organizations did - if NumFocus declares only OSI approved licenses as open, this means they ignore the decisions of other organizations and make it difficult for other licenses by default.
And NumFocus did mention OSI approved as a criterion here:

Although there is text below that disclaims it, your own reply focusing on the OSI decision, and this entire discussion, point out that this is the basis for your perspective on openness. This is my problem.

Is CC0 not considered open in your mind or in the mind of the board? Are other CC licenses that are different not eligible to be called open?

And then, did you ever consider what happens to projects in the far future? I am not talking about deceased developers or those who cannot be located - I am speaking about groups that fall apart or software that is abandoned. If you use an OSI approved license, it locks the project under the multi-contributor structure NumFocus has. If you use CC0 as a solution, then when this happens it gives the code life again and provides an incentive for development. So there are many benefits that NumFocus may not have discussed.

Look at two organizations that now provide content under CC0:

I recently learned that Covasim is released under another CC license:

Will data and models stored there not be accessible to larger organizations like Google because of the ultra permissive CC0 license? Will other CC licenses not be appropriate?

CC0 is relatively new compared to other open source licenses, so it will take more time for it to catch up and reach the numbers of existing licenses. Yet there is a growing number of entities using it, so when NumFocus makes no decision about its openness based on the decision of some company, it keeps things stagnant instead of moving them forward, and denies options to developers.

It also serves as a channelling mechanism to the community that points in a certain direction. This is why I want to learn the list of the companies that will not allow CC0 - and we already concluded the Google policy does not prohibit it. 

Stefan, copying other entities is not always the right thing to do - why are you copying X and not Y? This implies something about you. I want NumFocus to think ahead - not stay stagnant and copy. I want the NumFocus board to think and discuss the topic properly - from the laconic answer Andy broadcast I conclude the full discussion did not happen.

Having this discussion open will educate people more about what NumFocus is really doing and what it stands for. 

Hopefully my argument is clearer now. 

              Jacob


Hilmar Lapp

unread,
Apr 14, 2021, 6:04:49 PM4/14/21
to numf...@googlegroups.com, jacob....@gmail.com


On Apr 14, 2021, at 5:37 PM, Jacob Barhak <jacob....@gmail.com> wrote:

Look at two organizations that provide now content under CC0:

Dryad has always required data deposits to be released under CC0. Which is arguably the right policy for data but far from necessarily so for software. Indeed, this presented a barrier to getting software underlying a publication deposited and archived too, because there was no mechanism in Dryad for exempting the software deposit from the CC0 release. This has recently changed: in part to address this, Dryad is now partnering with Zenodo to allow pushing software items through to Zenodo for permanent archival, and (quoting from linked article) "Researchers will also have the opportunity to select the proper license for their software, as opposed to Dryad's CC0 license."

I’ll add that I write this solely with the intent to clarify about something of which I have relatively direct knowledge. I do otherwise feel that this thread has received more than its fair share of airtime and oxygen, and I did not mean to extend its lifetime. Please keep this in mind if with this I choose to disengage.

 -hilmar

-- 
Hilmar Lapp -:- lappland.io



Jacob Barhak

unread,
Apr 27, 2021, 11:57:38 AM4/27/21
to numf...@googlegroups.com
Well Andy,

Can you elaborate on your decision - I am still waiting for details.

Do you have any recordings of the NumFocus board meeting?

I am curious what "so many companies" means. Can you please name all the companies that were discussed in the meeting? 

I really want to know what is behind it all - I understand Google - if I were them I may act the same - their policy protects some of their interests and IP the way they operate and notice that Google does not have a strong ban - it is just that people do not read the fine print. Yet I want to know the full list of companies you discussed - hopefully all those names were recorded in meeting minutes you can share.

            Jacob

Andy Ray Terrel

unread,
Apr 27, 2021, 1:30:19 PM4/27/21
to numf...@googlegroups.com
Responding off list as multiple members of the community have asked to close this thread.


Jacob Barhak

unread,
Apr 27, 2021, 2:11:59 PM4/27/21
to numf...@googlegroups.com
Sorry Andy,

A discussion does not end because someone asks it to end - the discussion is resolved when everything is properly revealed. 

In this day and age, people are quite good at ignoring topics that are not interesting for them. And one does not need to respond if they do not like the topic - these are choices we make. 

This topic lies at the core of what NumFocus stands for - how do you define "open"?

If you intend to resolve the discussion, do it openly and transparently so everyone knows everything - it is the proper way to handle this.

              Jacob


Andy Ray Terrel

unread,
Apr 27, 2021, 2:20:08 PM4/27/21
to numf...@googlegroups.com
Jacob,

I respectfully disagree. If you would like to have a discussion off list I'm happy to do so but I won't be replying to this thread anymore.

-- Andy

Matthew Rocklin

unread,
Apr 27, 2021, 2:23:21 PM4/27/21
to numf...@googlegroups.com
I recommend that this conversation progress to a subcommittee of those who care strongly about the topic.  And then once that conversation is resolved a representative of that committee can update the rest of us on this thread.  

That might balance the desires for an open process while also allowing the broader set of us to opt-out of this conversation if we choose.


Pete[r] Landwehr

unread,
Apr 27, 2021, 2:26:13 PM4/27/21
to numf...@googlegroups.com
Jacob, with respect, you are the only other person participating in this conversation. Two people having a conversation about a topic raised by one of them is not a community discussion, regardless of the topic's import. If you'd like to re-post the results of your off-list conversation, that would presumably satisfy everyone and assuage the needs for disclosure.

Best,

pml 

Jacob Barhak

unread,
Apr 27, 2021, 2:40:29 PM4/27/21
to numf...@googlegroups.com
So Matt,

A committee may not be needed if the decision is proper.

What I am asking now is just the details behind the decision - those should be posted regularly anyway in an organization like this.

Also, I am curious about what companies have influence on NumFocus - this should also be public. 

Andy just needs to release the link to the meeting minutes - publicly. 

It will also give us a glimpse of what happens behind the scenes - this transparency is important. 

If later there is a need for correction, a committee may resolve this - although it should also make results public.

                Jacob

 

Andy Ray Terrel

unread,
Apr 28, 2021, 9:30:29 AM4/28/21
to numf...@googlegroups.com
Okay one more response. I was maybe too hungry in my last response.

My last statement from the board meeting was imprecise. 

On April 6, the board discussed adding Creative Commons licenses to the NumFOCUS default accepted licenses. The proposal was rejected by a majority vote. The notes will be available in the coming weeks.


I like the idea of splitting this conversation to a different open place for those interested. To this end, I have created the list lice...@numfocus.org. Please feel free to join that list at: https://groups.google.com/a/numfocus.org/g/licensing/

If we would like to propose a subcommittee on licensing, here is a charter template. Perhaps that could be a good first topic over at the new group.

-- Andy

Jacob Barhak

unread,
Apr 28, 2021, 10:40:58 AM4/28/21
to lice...@numfocus.org
Ok Andy,

This public email list is a good solution. I signed up for the new group and redirected the discussion there by BCCing the main group.

When you have the meeting minutes available, please post them there so they can be viewed publicly to start the discussion.

I am not sure about the committee charter rules - I am more interested in figuring out what is really going on. 

           Jacob

