Julia and the Tower of Babel


Gabriel Gellner

Oct 7, 2016, 11:35:46 AM
to julia-users

Something that I have been noticing, as I convert more of my research code over to Julia, is how the super easy to use package manager (which I love), coupled with the talent base of the Julia community seems to have a detrimental effect on the API consistency of the many “micro” packages that cover what I would consider the de-facto standard library.

What I mean is that a commercial package like Matlab or Mathematica, being written under one large umbrella, will largely (clearly not always) choose consistent names for similar API keyword arguments and have similar calling conventions for master-function-like tools (`optimize` versus `lbfgs`, etc.). I am starting to realize that this is one of the great selling points of these packages for an end user. I can usually guess what a keyword will be in Mathematica, whereas even after a year of using Julia almost exclusively I find I have to look at the documentation (or the source code, depending on the documentation ...) to figure out the keyword names in many common packages.

Similarly, in my experience with open source tools, due to the complexity of the package management, we get large “batteries included” distributions that cover a lot of the standard stuff for doing science, like Python’s numpy + scipy combination. Whereas in Julia the equivalent of scipy is split over many separately developed packages (Base, Optim.jl, NLopt.jl, Roots.jl, NLsolve.jl, ODE.jl/DifferentialEquations.jl). Many of these packages are stupid awesome, but they can have dramatically different naming conventions and calling behavior for essentially equivalent functionality. Recently I noticed that tolerances, for example, are named `atol/rtol` versus `abstol/reltol` versus `abs_tol/rel_tol`, which means it is extremely easy to have a piece of scientific code that needs to use all three conventions across different calls to seemingly similar libraries.
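A minimal sketch of the tolerance-naming problem described above, assuming Julia 1.x. The three `solve_*` functions are hypothetical stand-ins, not the API of any real package; only the keyword spellings are taken from the post.

```julia
# Hypothetical solvers standing in for three separately developed packages.
# Each one just echoes the tolerances it received.
solve_a(f; atol=1e-8, rtol=1e-6)       = (atol, rtol)
solve_b(f; abstol=1e-8, reltol=1e-6)   = (abstol, reltol)
solve_c(f; abs_tol=1e-8, rel_tol=1e-6) = (abs_tol, rel_tol)

f(x) = x^2 - 2

# The same request has to be spelled three different ways in one script.
results = (
    solve_a(f; atol=1e-10, rtol=1e-8),
    solve_b(f; abstol=1e-10, reltol=1e-8),
    solve_c(f; abs_tol=1e-10, rel_tol=1e-8),
)

# Using another package's spelling is an error, not a silent default.
wrong_spelling_fails = try
    solve_a(f; abstol=1e-10)
    false
catch err
    err isa MethodError
end
```

Because an unsupported keyword raises an error rather than being ignored, the mismatch is at least caught at the call, but the user still has to remember which spelling belongs to which package.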

Having brought this up, I find that the community is largely sympathetic and, in general, would support a common convention. The issue, I have slowly realized, is that it is rarely that straightforward. In the above example, abstol/reltol versus abs_tol/rel_tol seems like an easy example of what can be tidied up, but the latter underscored name is consistent with similar naming conventions from Optim.jl for other tolerances, so that community is reluctant to change the convention. Similarly, I think there would be little interest in changing abstol/reltol to the underscored version in packages like Base, ODE.jl etc., as this feels consistent with each of these code bases. Hence I have started to think that the problem is the micro-packaging. It is much easier to look for consistency within a package than across similar packages, and since Julia seems to distribute so many of the essential tools within very narrow boundaries of functionality, I am not sure that this kind of naming convention will ever reach the level of something like SciPy, or the even higher standard of commercial packages like Matlab/Mathematica. (I am sure there are many more examples, like using maxiter versus iterations for describing stopping criteria in iterative solvers ...)

Even further, I have noticed that even when packages try to stay consistent with each other, for example Optim.jl <-> Roots.jl <-> NLsolve.jl, when one package changes how it does things (Optim.jl moving to delegation on types for method choice), the consistency quickly fractures again. We now have a common divide between type-based dispatch arguments and :method symbol names across those packages (not to mention the whole in-place versus not-in-place question for function arguments …)
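The two calling styles contrasted above can be sketched as follows, assuming Julia 1.x. Everything here (`findroot`, `Bisection`, `Secant`, and the `method` keyword) is hypothetical and only illustrates the divide; it is not the actual API of Optim.jl, Roots.jl, or NLsolve.jl.

```julia
# Hypothetical method markers; not any package's actual types.
abstract type AbstractMethod end
struct Bisection <: AbstractMethod end
struct Secant <: AbstractMethod end

# Plain bisection on [a, b], assuming f(a) and f(b) have opposite signs.
function bisect(f, a, b; iterations=100)
    for _ in 1:iterations
        m = (a + b) / 2
        if sign(f(m)) == sign(f(a))
            a = m
        else
            b = m
        end
    end
    return (a + b) / 2
end

# Plain secant iteration from two starting points.
function secant(f, x0, x1; iterations=50)
    for _ in 1:iterations
        f0, f1 = f(x0), f(x1)
        f1 == f0 && break   # stop before dividing by zero
        x0, x1 = x1, x1 - f1 * (x1 - x0) / (f1 - f0)
    end
    return x1
end

# Style 1: the algorithm is chosen by dispatching on a type.
findroot(f, a, b, ::Bisection) = bisect(f, a, b)
findroot(f, a, b, ::Secant) = secant(f, a, b)

# Style 2: the algorithm is chosen by a symbol-valued keyword.
function findroot(f, a, b; method::Symbol=:bisection)
    method === :bisection && return findroot(f, a, b, Bisection())
    method === :secant && return findroot(f, a, b, Secant())
    throw(ArgumentError("unknown method: $method"))
end

g(x) = x^2 - 2
r_typed = findroot(g, 1.0, 2.0, Secant())
r_symbol = findroot(g, 1.0, 2.0; method=:bisection)
```

Both styles can coexist in one function, as here, but a user moving between packages that each picked only one of them has to relearn the call.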

Do people with more experience in scientific package ecosystems feel this is solvable? Or do micro distributions just lead to many, many varying degrees of API conventions that need to be learned by end users? Is this common in communities that use C++ etc.? I ask as I wonder how much this kind of thing can be worried about when making small packages is so easy.

David Anthoff

Oct 7, 2016, 12:03:41 PM
to julia...@googlegroups.com

I don’t have a solution, but I completely agree with the problem description.

 

I guess one small step would be that package authors should follow the patterns in base, if there are any.

Tom Breloff

Oct 7, 2016, 12:24:54 PM
to julia-users
This is something that I've spent a lot of time and energy thinking about and discussing, as part of both Plots and JuliaML.  I think the situation can be improved in a big way, but this is not something with a "magic solution".  It takes time, effort, and a constant desire to collaborate and design with care for the greater community.  As soon as people get lazy, it starts to get unwieldy.  So I think the "solution" is just to keep at it... keep trying to collaborate... keep trying to agree on common conventions... and always look to find common ground.  Use Base as a guide, whenever possible, and if there are different conventions in place across packages, then spend the time to agree on shared conventions.  And if people refuse to collaborate, give them crap about it.

Andreas Lobinger

Oct 7, 2016, 12:28:47 PM
to julia-users
Hello colleague,

On Friday, October 7, 2016 at 5:35:46 PM UTC+2, Gabriel Gellner wrote:

Something that I have been noticing, as I convert more of my research code over to Julia, is how the super easy to use package manager (which I love), coupled with the talent base of the Julia community seems to have a detrimental effect on the API consistency of the many “micro” packages that cover what I would consider the de-facto standard library. ....

 well, you consider 'this' the de-facto standard library and others consider 'that' a reasonable standard library and others ...

If you see the need for standardisation of interfaces, just volunteer to write a style guide and open issues and PRs on the respective packages. All this is open source and the development process is transparent on github. For exactly that reason: collaboration.

I'm contributing to the ecosystem and it has been really a pleasure to be part of the story.

Wishing a happy day,
        Andreas

John Myles White

Oct 7, 2016, 12:49:47 PM
to julia-users
I don't really see how you can solve this without a single dictator who controls the package ecosystem. I'm not enough of an expert in Python to say how well things work there, but the R ecosystem is vastly less organized than the Julia ecosystem. Insofar as it's getting better, it's because the community has agreed to make Hadley Wickham their benevolent dictator.

 --John

Gabriel Gellner

Oct 7, 2016, 1:41:40 PM
to julia-users
Yeah, the R system is probably the best guide, as it also has a pretty easy to use package manager ... hence so, so many packages ;) I think Python works without a single BDFL (for science at least) since the core packages are monolithic, so the consistency is immediately apparent, and I find programmers, as a rule, dislike inconsistent APIs within a given project (while seemingly less worried across packages).

In response to Andreas and Tom, I don't mean to sound like I don't want to collaborate, rather starting this discussion between conventions in Base and Optim.jl for example made me realize that it is not clear the solution is just a matter of simple discussion, rather each group would need to sacrifice a certain level of API consistency if they used the other's convention ... and like John says, usually that kind of decision requires someone to make a command from on high, which having a loose package system doesn't always facilitate. But we shall see, maybe it doesn't matter in the long run.

I just find it stressful, when I am making my own package, to decide which convention is best to follow ... every choice feels like a severe tradeoff (do I use reltol to be like Base, which will be less and less of a guide as packages are moved out of Base ..., or do I use rel_tol because my package will commonly be used in conjunction with Optim.jl ...).

Thanks for the responses though;
this was something I noticed, but I wasn't sure what others felt.

all the best.

jonatha...@alumni.epfl.ch

Oct 8, 2016, 4:47:07 AM
to julia-users
Maybe an "easy" first step would be to have a page (a github repo) containing domain-specific naming conventions (atol/abstol) that package
developers can look up. Even though existing packages might not adopt them, at least newly created ones would have a chance
to be more consistent. You could even write a small tool that parses your files and warns you about improper naming.
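A rough sketch of what such a checking tool could look like, assuming Julia 1.x. The convention table is purely a placeholder (the thread had not agreed on any spelling), and `check_names` and `keyword_names!` are hypothetical names, not part of any existing package.

```julia
# Hypothetical convention table: nonstandard spelling => preferred spelling.
# The preferred spellings are placeholders, not an agreed convention.
const PREFERRED = Dict(
    :atol => :abstol, :abs_tol => :abstol,
    :rtol => :reltol, :rel_tol => :reltol,
    :iterations => :maxiter,
)

# Recursively collect keyword names (`name=value` inside calls or signatures).
function keyword_names!(names::Vector{Symbol}, ex)
    ex isa Expr || return names
    if ex.head === :kw
        name = ex.args[1]
        # Strip a type annotation such as `abs_tol::Float64`.
        name isa Expr && name.head === :(::) && (name = name.args[1])
        name isa Symbol && push!(names, name)
    end
    foreach(arg -> keyword_names!(names, arg), ex.args)
    return names
end

# Parse a source string and return one warning per nonstandard keyword name.
function check_names(src::AbstractString)
    ex = Meta.parse("begin\n" * src * "\nend")
    found = keyword_names!(Symbol[], ex)
    return ["`$n` is nonstandard; prefer `$(PREFERRED[n])`" for n in found if haskey(PREFERRED, n)]
end

warnings = check_names("solve(f; abs_tol=1e-8, maxiter=100)\nfit(x, rtol=1e-3)")
foreach(println, warnings)
```

This only looks at keyword names in the parsed syntax tree; it does not know which function a keyword belongs to, so it is a naming check and nothing more.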

Milan Bouchet-Valat

Oct 8, 2016, 5:37:13 AM
to julia...@googlegroups.com
On Saturday, October 8, 2016 at 01:47 -0700, jonatha...@alumni.epfl.ch
wrote:
Creating a web page like this sounds like a good idea.

As regards automatic checking, note that there's already Lint.jl, to
which a list of "nonstandard" names could be added, together with
recommendations.


Regards

Chris Rackauckas

Oct 8, 2016, 6:11:49 AM
to julia-users
Create a repo where we can all bikeshed different names, agree upon some, and then standardize. I honestly don't care which conventions are chosen and will just find/replace with whatever people want, but there has to be a "whatever people want" to do that.

Traktor Toni

Oct 8, 2016, 6:39:55 AM
to julia-users
In my opinion the solutions to this are very clear, or would be:

1. make a mandatory linter for all julia code
2. julia IDEs should offer good intellisense

Chris Rackauckas

Oct 8, 2016, 6:59:51 AM
to julia-users
Conventions would have to be arrived at before this is possible.

Jeffrey Sarnoff

Oct 8, 2016, 8:42:05 AM
to julia-users
I have created a new Organization on github: JuliaPraxis.
Everyone who has added to this thread will get an invitation to join, and so contribute.
I will set up the site and let you know how to include your wor(l)d views.

Anyone else is welcome to post to this thread, and I will send an invitation.

Giuseppe Ragusa

Oct 8, 2016, 9:07:33 AM
to julia-users
JuliaPraxis seems like a good idea. I have been struggling with trying to get consistent naming, and having a guide to follow may at least cut the struggling time short.

Tsur Herman

Oct 8, 2016, 10:12:22 AM
to julia-users
I noticed this also .. and this is why I chose to "rip" some packages for some of their functionality.

From what I observed, the problem is the "coolness" of the language and the highly creative level of the package writers. Just as the first post here
states, the two seeming advantages, a cool language and super-creative package writers .. can sometimes have a "babel tower" effect.

I encountered this with respect to image processing, geometry primitive manipulation, etc. .. the problem is: too many types!!

if something can be represented as an array with some convention for example MxN array where M is the Descriptor size and N is the number of Descriptors  .. then it is better to use and support that 
than to declare more specialized types.

At least for fast paced research and idea validation it is better. Probably for implementation and performance specialized types optimized for speed will be required..
 
 

Tom Breloff

Oct 8, 2016, 10:42:40 AM
to julia-users
I think sometimes people go overboard with types, but types allow us to take full advantage of multiple dispatch and abstraction on another level.  For example, a diagonal matrix and a full/dense matrix are both the same thing, but if you can dispatch on them differently you can massively improve the effectiveness/performance of the underlying code without much effort.  A recent thread here was asking about how to do this effectively in Python, and... well, everyone just kinda laughed at the idea.  Types allow us flexibility that we can't have otherwise.
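A small sketch of the diagonal-versus-dense point, assuming Julia 1.x and the standard `LinearAlgebra` library. `solve_system` is a hypothetical function name; `Diagonal`, `lu`, and the `diag` field are real.

```julia
using LinearAlgebra

# Generic fallback: any matrix goes through a general LU factorization.
solve_system(A::AbstractMatrix, b::AbstractVector) = lu(A) \ b

# Specialized method: a diagonal system is solved by elementwise division.
solve_system(D::Diagonal, b::AbstractVector) = b ./ D.diag

d = [2.0, 4.0, 8.0]
b = [2.0, 2.0, 2.0]

x_dense = solve_system(Matrix(Diagonal(d)), b)  # takes the generic method
x_diag = solve_system(Diagonal(d), b)           # takes the Diagonal method
```

The caller writes the same `solve_system(A, b)` in both cases; the type of `A` alone decides which implementation runs.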

Stefan Karpinski

Oct 8, 2016, 10:43:12 AM
to Julia Users
Good generic API design is one of the hardest problems around. For many problem areas, we just haven't found the right design yet. JuMP is one of the prime examples of brilliant work in this area. Mathematica is the best example of consistent APIs in a language and its ecosystem because Stephen Wolfram literally reviews and approves every single function that's added. We can't do that since this is an open source community and we don't have dictators, and honestly no one has the time or breadth of expertise to do this for all the amazing areas people are using Julia in. There are some things that are helpful, however.

GitHub orgs. Having related packages under a single org is weirdly effective – way more than it seems like it should be. I think this is about awareness and communication. Not a panacea, but more helpful than you would imagine.

Communication. Long hard conversations like this one. Get people talking about what the common API should look like. Once people agree on a good one, implementation is often easier than one might think.

Generic functions. Julia's multiple dispatch is good at this, especially because it allows you to disentangle nouns and verbs, and different people can work on different parts of the vocabulary. Have a good set of nouns like Distributions? Anyone can add their own verbs. Have some good consistent verbs? Making them apply to your own nouns is no problem either. See the esoteric-seeming expression problem [1,2,3] – which doesn't even occur to Julia programmers as being a problem because the solution is so natural.
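The noun/verb separation can be sketched with three modules standing in for three separately developed packages, assuming Julia 1.x. All module, type, and function names here are hypothetical.

```julia
module Shapes                      # a package of nouns
export Circle, Square
struct Circle; r::Float64; end
struct Square; side::Float64; end
end

module Geometry                    # a package of verbs for those nouns
using ..Shapes
export area
area(c::Circle) = pi * c.r^2
area(s::Square) = s.side^2
end

module MyExtension                 # a third party adds a noun and a verb
using ..Shapes, ..Geometry
import ..Geometry: area
export Triangle, perimeter
struct Triangle; base::Float64; height::Float64; end
area(t::Triangle) = t.base * t.height / 2     # existing verb, new noun
perimeter(s::Square) = 4 * s.side             # new verb, existing noun
end

using .Shapes, .Geometry, .MyExtension

areas = (area(Square(2.0)), area(Triangle(3.0, 4.0)))
edge = perimeter(Square(2.0))
```

Neither `Shapes` nor `Geometry` had to be modified for the extension to work, which is the situation the expression problem describes.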

Persistence. The more speculative and active a research area is, the less likely we are to have a consensus on what the generic interfaces and APIs should look like. Optimization APIs were all over the place until things like JuMP and Convex came along. Now you can swap out different solvers easily and keep the expression of your problem the same. Changing deep learning backends should be just as easy, but it's certainly not – because people are still trying to figure out what the interface between how you program and how you implement these systems should be.

Summary: keep trying, communicate, create organizations, and use multiple dispatch effectively.



Michael Borregaard

Oct 9, 2016, 5:59:12 AM
to julia-users

Great to see this brought up here, and to read the constructive and thought-provoking responses from members of the Julia community. I feel this is highly important and I have thought a lot about it recently, as I am writing an invited guest editorial for a leading ecological journal about how transferring to Julia as the lingua franca for ecological scientists may affect the way we do science and work together.

I come to this from a somewhat different angle, as the ecological community is almost 100% wedded to R – the use of R has practically exploded within the last 5 years alone. So when I came to Julia I was struck by how structured the package ecosystem appears to be, in spite of the micropackaging. This seems to me to be a huge advantage for collaboration, creativity and methods development, and IMHO this will in the end be a stronger argument for our community to make the transition than the speed of computation.

I think there are a number of reasons for this difference, but I also believe that a primary reason is the reliance on github for developing the package ecosystem from the bottom up, and the use of organizations. These organisations, like JuliaGeo, BioJulia etc in effect act like standard package distributions, both by facilitating communication within, but also by imposing a set of strict guidelines on code compatibility. Centrally, the organisations are really visible centers for where development in a given field takes place, and thus the culture encourages developers to contribute to existing packages and organisations rather than inventing new packages. In R that is not the case - instead most scientific packages are one-lab projects developed to serve a certain research program.

I do hope that this can continue in the future, but one might worry: right now most Julia developers are driven by a desire to help build the language itself, but when it grows over a certain size and becomes established this is sure to become less pronounced. Also, the current practice of software papers in scientific journals means that researchers get credit for developing new packages, but none for contributing to existing packages. This directly counteracts the best interests of the community.

It is a new situation to have a scientific language that is built openly and communally, yet with such a high degree of integration and communication. The solution must be culture, as written by Stefan and Tom, specifically to develop the community culture to keep communicating, discussing and agreeing upon standards. Also, for instance, organizations like the biojulia community are very good at identifying new ad-hoc packages coming out that relate to their work, and invite developers to join the communal effort and build the foundation of Julia in their field instead of creating lots of partial alternatives. I think this is key.

But perhaps this could be strengthened by being more explicit about building modular 'standard libraries', like in the respective organizations, but perhaps also for base (or statistics/numerical analysis, at least) that impose strict internal guidelines for conformance? These organizations, of course, would need mechanisms for ensuring renewal within the basic idioms, so development does not die.

I for one will follow this development with keen interest.

Jeffrey Sarnoff

Oct 9, 2016, 8:22:27 PM
to julia-users

JuliaPraxis is on  github and gitter ... bring our praxes. 

Páll Haraldsson

Oct 13, 2016, 7:35:08 AM
to julia-users
On Friday, October 7, 2016 at 3:35:46 PM UTC, Gabriel Gellner wrote:

`atol/rtol` versus `abstol/reltol` versus `abs_tol/rel_tol`


For the latter "versus" at least (and other examples), this would be solved by style-insensitivity, as in the Nimrod (or Nim) language, the only one I've heard of that does this (not sure of its status; maybe they dropped it with the name change).

I hesitated to propose this for Julia when I first discovered it; I am (or was) conflicted. I thought this would break code, as it's a breaking change, but would it in fact help(?)

This could in theory be done with a macro(?)
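As a rough illustration of the macro idea, assuming Julia 1.x: the hypothetical macro `@anystyle` below rewrites the keyword names of a call by lowercasing them and dropping underscores, in the spirit of the Nim rule quoted further down. It is a sketch, not an existing package, and it only helps when the callee itself uses the normalized spelling.

```julia
# Normalize an identifier Nim-style: ignore case and underscores.
normalize_name(s::Symbol) = Symbol(lowercase(replace(String(s), "_" => "")))

# Rewrite keyword names in a call expression (nested calls included);
# all other syntax is returned unchanged.
function normalize_keywords(ex)
    ex isa Expr || return ex
    if ex.head === :kw && ex.args[1] isa Symbol
        return Expr(:kw, normalize_name(ex.args[1]), normalize_keywords(ex.args[2]))
    elseif ex.head === :call || ex.head === :parameters
        return Expr(ex.head, map(normalize_keywords, ex.args)...)
    end
    return ex
end

# Hypothetical macro name; applies the rewrite to a single call.
macro anystyle(call)
    esc(normalize_keywords(call))
end

# A function that only understands `abstol`/`reltol`.
tolerances(; abstol=1e-8, reltol=1e-6) = (abstol, reltol)

a = @anystyle tolerances(abs_tol=1e-10, rel_tol=1e-9)
b = @anystyle tolerances(; absTol=1e-10, RelTol=1e-9)
```

This covers `abstol` versus `abs_tol` but not `atol` versus `abstol`, which matches the "latter versus" caveat in the post.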



Style Insensitive?
https://github.com/nim-lang/Nim/issues/521
"Nimrod is a style-insensitive language. This means that it is not case-sensitive and even underscores are ignored: type is a reserved word, and so is TYPE or T_Y_P_E. The idea behind this is that this allows programmers to use their own preferred spelling style and libraries written by different programmers cannot use incompatible conventions.

Please rethink about that or at least give us an option to disable both: case insensitive and also underscore ignored

[another user]:

Also a consistent style for code bases is VASTLY overrated, in fact I almost never had the luxury of it and yet it was never a problem."


(Trivia on Nim[rod], D and upcoming(?) C++ below; I was just looking up the hard-to-find info above.)

http://nim-lang.org/docs/nep1.html

Naming Conventions


* Type identifiers should be in PascalCase. All other identifiers should be in camelCase with the exception of constants which may use PascalCase but are not required to.
[..]

For constants coming from a C/C++ wrapper, ALL_UPPERCASE are allowed, but ugly. (Why shout CONSTANT? Constants do no harm, variables do!)



http://nim-lang.org/


* A fast non-tracing garbage collector that supports soft real-time systems (like games).

* System programming features: Ability to manage your own memory and access the hardware directly. Pointers to garbage collected memory are distinguished from pointers to manually managed memory.

[..]

* Macros can modify the abstract syntax tree at compile time.
[..]
* Macros cannot change Nim's syntax because there is no need for it. Nim's syntax is flexible enough.

* Statements are grouped by indentation but can span multiple lines. Indentation must not contain tabulators so the compiler always sees the code the same way as you do.


https://en.wikipedia.org/wiki/Nim_(programming_language)

"Nim (formerly named Nimrod)
[..]

Language design

Influenced by

[..]
Lisp: Macro system, embrace the AST, homoiconicity
[..]


UFCS, a feature supported by Nim" [and D]:


https://en.wikipedia.org/wiki/Uniform_Function_Call_Syntax


"It has been proposed (as of 2016) for addition to C++ by Bjarne Stroustrup[3] and Herb Sutter, to reduce the ambiguous decision between

[..]

    // All the followings are correct and equivalent
    int b = first(a);
    int c = a.first();
    int d = a.first;
"

Páll Haraldsson

Oct 13, 2016, 8:07:18 AM
to julia-users
On Sunday, October 9, 2016 at 9:59:12 AM UTC, Michael Borregaard wrote:

So when I came to julia I was struck by how structured the package ecosystem appears to be, yet, in spite of the micropackaging. [..] I think there are a number of reasons for this difference, but I also believe that a primary reason is the reliance on github for developing the package ecosystem from the bottom up, and the use of organizations.

Could be; my feeling is that Julia allows for better

https://en.wikipedia.org/wiki/Separation_of_concerns [term "was probably coined by Edsger W. Dijkstra in his 1974 paper "On the role of scientific thought" "; synonym for "modularity"?]

than other languages. OO (and information hiding) has been credited with helping, but my feeling is that multiple dispatch is even better for it.


That is, leads to low:

https://en.wikipedia.org/wiki/Coupling_(computer_programming)
"Coupling is usually contrasted with cohesion. Low coupling often correlates with high cohesion, and vice versa. Low coupling is often a sign of a well-structured computer system and a good design"


https://en.wikipedia.org/wiki/Cohesion_(computer_science)

Now, as an outsider looking in, e.g. on:

https://en.wikipedia.org/wiki/Automatic_differentiation

There seems to be lots of redundant packages with e.g.

https://github.com/denizyuret/AutoGrad.jl


Maybe it's just my limited math skills showing; are there subtle differences explaining or requiring all these packages?

Do you expect some/many packages to just die?

One solution to many similar packages is a:

https://en.wikipedia.org/wiki/Facade_pattern

e.g. Plots.jl and then backends (you may care less about(?)).
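A minimal sketch of the facade idea, assuming Julia 1.x. The backends here are two finite-difference rules (a deliberate simplification, not automatic differentiation), and every name (`Backend`, `ForwardDifference`, `CentralDifference`, `derivative`) is hypothetical rather than taken from Plots.jl or any AD package.

```julia
# Hypothetical backends standing in for separately developed packages.
abstract type Backend end
struct ForwardDifference <: Backend; h::Float64; end
struct CentralDifference <: Backend; h::Float64; end

# Backend-specific implementations (what each "package" would provide).
_derivative(b::ForwardDifference, f, x) = (f(x + b.h) - f(x)) / b.h
_derivative(b::CentralDifference, f, x) = (f(x + b.h) - f(x - b.h)) / (2 * b.h)

# The facade: one user-facing function with one calling convention.
derivative(f, x; backend::Backend=CentralDifference(1e-6)) = _derivative(backend, f, x)

d1 = derivative(sin, 0.0)
d2 = derivative(sin, 0.0; backend=ForwardDifference(1e-6))
```

User code only ever calls `derivative`; swapping the backend is a one-argument change, which is the property the facade is meant to provide.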


Not sure when you use all these similar (or complementary?) packages together.. if it applies.


In my other answer I misquoted; to make clear, the original user's comment is itself quoting:

Style Insensitive?
https://github.com/nim-lang/Nim/issues/521
>Nimrod is a style-insensitive language. This means that it is not case-sensitive and even underscores are ignored: type is a reserved word, and so is TYPE or T_Y_P_E. The idea behind this is that this allows programmers to use their own preferred spelling style and libraries written by different programmers cannot use incompatible conventions. [..]

Jeffrey Sarnoff

Oct 14, 2016, 7:17:36 AM
to julia-users
first pass at naming guidelines https://github.com/JuliaPraxis/Naming

Jeffrey Sarnoff

Oct 14, 2016, 3:15:44 PM
to julia-users
Just clarifying: For a two part package name that begins with an acronym and ends in a word   
  
the present guidance:   
     the acronym is to be uppercased and the second word is to be capitalized, no separator.  
     so: CSSScripts, HTMLLinks  

the desired guidance (from 24hrs of feedback):   
     the acronym is to be titlecased and the second word is to be capitalized, no separator.   
     so: CssScripts, HtmlLinks

What is behind the present guidance?



Jeffrey Sarnoff

Oct 24, 2016, 1:38:39 PM
to julia-users
update on package names that begin with an acronym .. following much discussion, the rule which a strong preponderance of participants favor:
   the acronym is to be uppercased and the following words camelcased, no separator.
   so: CSSscripts, HTMLlinks, XMLparser.

This does not match the current docs:
     the acronym is to be uppercased and the second word is to be capitalized, no separator.  
     so: CSSScripts, HTMLLinks, XMLParser  

The reasoning I found most persuasive is that the current docs' rule
undermines Julia's developing reputation for expressive clarity. 
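The two casing rules compared in this post can be written down as small functions, assuming Julia 1.x. The function names are hypothetical, and the treatment of names with more than one trailing word is a guess at what "camelcased" means here.

```julia
# Rule favored in the discussion: ACRONYM followed by camelCased words.
function acronym_camel(acronym, words...)
    isempty(words) && return uppercase(acronym)
    first_word = lowercase(words[1])
    rest = join(uppercasefirst(lowercase(w)) for w in words[2:end])
    return uppercase(acronym) * first_word * rest
end

# The other rule in the post: ACRONYM followed by Capitalized words.
acronym_capitalized(acronym, words...) =
    uppercase(acronym) * join(uppercasefirst(lowercase(w)) for w in words)

favored = [acronym_camel("css", "scripts"), acronym_camel("html", "links"), acronym_camel("xml", "parser")]
other = [acronym_capitalized("css", "scripts"), acronym_capitalized("html", "links"), acronym_capitalized("xml", "parser")]
```

The two rules differ only in the case of the first letter after the acronym, which is exactly where the acronym boundary becomes hard to read in `CSSScripts`.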

Jeffrey Sarnoff

Oct 24, 2016, 5:40:11 PM
to julia-users
Actually, all is good.  The current docs do not take a stand on the use of case following an acronym.  