Directory layout of contrib/

Patrick Lawson

Feb 7, 2015, 2:12:51 PM
to pants-devel
At the summit, we concluded that a `contrib` directory in the pantsbuild.pants repo would act as a home for plugins and utilities that didn't quite fit into core pants, or which were still new enough that we wanted to delineate them from mature backends.

I'm about to put some time into open sourcing and upstreaming buildgen (another action item from the summit), and I want to add it to this directory.  Here's my proposed layout (using buildgen as an example):

contrib/buildgen/{src,test}/{jvm,python,etc}/

Concretely for jvm:

contrib/buildgen/{src,test}/jvm/com/pants/contrib/buildgen/...

Concretely for python:

contrib/buildgen/src/python/pants/contrib/buildgen/...
and
contrib/buildgen/test/python/pants_test/contrib/buildgen/...


With the big questions being:
  * Should {src,test} be under contrib/{subproject}?  I'm voting yes since it localizes a contributed project's code into a single subdirectory, making it easier to read and navigate.  The downside of this is that it adds a couple of source roots per contributed project.

* Should python code in contributed projects be under the pants namespace package, in particular pants.contrib?  I'm also going with yes since it will play well with our intended goal of distributing modular plugins for pants under the pants namespace.  But namespace packages in python can be tricky, so I'm interested in hearing opinions on this.

Any other thoughts?

John Sirois

Feb 7, 2015, 7:47:25 PM
to Patrick Lawson, pants-devel
On Sat, Feb 7, 2015 at 12:12 PM, Patrick Lawson <p...@foursquare.com> wrote:
At the summit, we concluded that a `contrib` directory in the pantsbuild.pants repo would act as a home for plugins and utilities that didn't quite fit into core pants, or which were still new enough that we wanted to delineate them from mature backends.

I'm about to put some time into open sourcing and upstreaming buildgen (another action item from the summit), and I want to add it to this directory.  Here's my proposed layout (using buildgen as an example):

contrib/buildgen/{src,test}/{jvm,python,etc}/

Concretely for jvm:

contrib/buildgen/{src,test}/jvm/com/pants/contrib/buildgen/...

Concretely for python:

contrib/buildgen/src/python/pants/contrib/buildgen/...
and
contrib/buildgen/test/python/pants_test/contrib/buildgen/...


With the big questions being:
  * Should {src,test} be under contrib/{subproject}?  I'm voting yes since it localizes a contributed project's code into a single subdirectory, making it easier to read and navigate.  The downside of this is that it adds a couple of source roots per contributed project.

I'd personally prefer:
contrib/
  src/python/
    buildgen/
    scroogegen/
  src/jvm/com/pants/
    buildgen/
    scroogegen/

AKA a single source-tree layout for all contrib - *BUT* - your proposal will almost certainly be more friendly to the vast majority of contributors... so sgtm.
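
To make the source-root trade-off concrete, here is a rough sketch that enumerates the roots each layout implies. The helper names are hypothetical, and it assumes the shared layout would mirror `src` with a matching `test` tree, which the thread does not spell out.

```python
from itertools import product


def per_project_roots(projects, languages, kinds=("src", "test")):
    """Source roots when each contrib project owns its own src/ and test/ trees."""
    return [
        f"contrib/{project}/{kind}/{lang}"
        for project, kind, lang in product(projects, kinds, languages)
    ]


def shared_roots(languages, kinds=("src", "test")):
    """Source roots when all contrib projects share one src/ and test/ tree.

    Assumption: the shared layout has a test/ tree parallel to src/.
    """
    return [f"contrib/{kind}/{lang}" for kind, lang in product(kinds, languages)]


projects = ["buildgen", "scroogegen"]
languages = ["jvm", "python"]

per_project = per_project_roots(projects, languages)
shared = shared_roots(languages)
print(len(per_project), "roots per-project vs", len(shared), "shared")
```

The per-project count grows with every new contrib project, while the shared count depends only on the number of languages.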

 

* Should python code in contributed projects be under the pants namespace package, in particular pants.contrib?  I'm also going with yes since it will play well with our intended goal of distributing modular plugins for pants under the pants namespace.  But namespace packages in python can be tricky, so I'm interested in hearing opinions on this.

They are tricky, but pants must be able to handle them since pants users will use them even if we don't.  So I'm again +1
 

Any other thoughts?

Benjy Weinberger

Feb 7, 2015, 8:00:18 PM
to John Sirois, Patrick Lawson, pants-devel
On Sat, Feb 7, 2015 at 4:47 PM, John Sirois <john....@gmail.com> wrote:


On Sat, Feb 7, 2015 at 12:12 PM, Patrick Lawson <p...@foursquare.com> wrote:
At the summit, we concluded that a `contrib` directory in the pantsbuild.pants repo would act as a home for plugins and utilities that didn't quite fit into core pants, or which were still new enough that we wanted to delineate them from mature backends.

I'm about to put some time into open sourcing and upstreaming buildgen (another action item from the summit), and I want to add it to this directory.  Here's my proposed layout (using buildgen as an example):

contrib/buildgen/{src,test}/{jvm,python,etc}/

Concretely for jvm:

contrib/buildgen/{src,test}/jvm/com/pants/contrib/buildgen/...

Concretely for python:

contrib/buildgen/src/python/pants/contrib/buildgen/...
and
contrib/buildgen/test/python/pants_test/contrib/buildgen/...


With the big questions being:
  * Should {src,test} be under contrib/{subproject}?  I'm voting yes since it localizes a contributed project's code into a single subdirectory, making it easier to read and navigate.  The downside of this is that it adds a couple of source roots per contributed project.

I'd personally prefer:
contrib/
  src/python/
    buildgen/
    scroogegen/
  src/jvm/com/pants/
    buildgen/
    scroogegen/

AKA a single source-tree layout for all contrib - *BUT* - your proposal will almost certainly be more friendly to the vast majority of contributors... so sgtm.

I agree with John that a single layout for all contrib is preferable. Otherwise we all have to keep modifying our IDE settings. In general, proliferation of source roots is one of the things Pants is great at preventing in our monorepos. It's at least partly why we're all using it. So I think we should practice what we preach. 
 

 

* Should python code in contributed projects be under the pants namespace package, in particular pants.contrib?  I'm also going with yes since it will play well with our intended goal of distributing modular plugins for pants under the pants namespace.  But namespace packages in python can be tricky, so I'm interested in hearing opinions on this.

They are tricky, but pants must be able to handle them since pants users will use them even if we don't.  So I'm again +1

I'd prefer to use pants_contrib as the top-level namespace, and avoid a world of hurt.

 
 

Any other thoughts?


Andy Reitz

Feb 7, 2015, 11:29:39 PM
to Benjy Weinberger, John Sirois, Patrick Lawson, pants-devel
On Sat, Feb 7, 2015 at 4:59 PM, Benjy Weinberger <be...@foursquare.com> wrote:


On Sat, Feb 7, 2015 at 4:47 PM, John Sirois <john....@gmail.com> wrote:


On Sat, Feb 7, 2015 at 12:12 PM, Patrick Lawson <p...@foursquare.com> wrote:
At the summit, we concluded that a `contrib` directory in the pantsbuild.pants repo would act as a home for plugins and utilities that didn't quite fit into core pants, or which were still new enough that we wanted to delineate them from mature backends.

I'm about to put some time into open sourcing and upstreaming buildgen (another action item from the summit), and I want to add it to this directory.  Here's my proposed layout (using buildgen as an example):

contrib/buildgen/{src,test}/{jvm,python,etc}/

Concretely for jvm:

contrib/buildgen/{src,test}/jvm/com/pants/contrib/buildgen/...

Concretely for python:

contrib/buildgen/src/python/pants/contrib/buildgen/...
and
contrib/buildgen/test/python/pants_test/contrib/buildgen/...


With the big questions being:
  * Should {src,test} be under contrib/{subproject}?  I'm voting yes since it localizes a contributed project's code into a single subdirectory, making it easier to read and navigate.  The downside of this is that it adds a couple of source roots per contributed project.

I'd personally prefer:
contrib/
  src/python/
    buildgen/
    scroogegen/
  src/jvm/com/pants/
    buildgen/
    scroogegen/

AKA a single source-tree layout for all contrib - *BUT* - your proposal will almost certainly be more friendly to the vast majority of contributors... so sgtm.

I agree with John that a single layout for all contrib is preferable. Otherwise we all have to keep modifying our IDE settings. In general, proliferation of source roots is one of the things Pants is great at preventing in our monorepos. It's at least partly why we're all using it. So I think we should practice what we preach. 

Correct me if I'm wrong, but it seems like Patrick is essentially proposing that we use maven_layout() for projects in contrib. We use that quite heavily at Twitter, and are quite comfortable with it. In particular, my understanding is that the pants plugin for IntelliJ has strong support for multiple source roots (i.e. maven_layout()).

So I'm +1 on making each contrib a self-contained directory. I think it will make it easier for newcomers to see all of the files entailed in a contributed project.

-Andy.
 

John Sirois

Feb 7, 2015, 11:44:08 PM
to Andy Reitz, Benjy Weinberger, Patrick Lawson, pants-devel
I don't think the current support of the intellij plugin is a great argument.  It must support single source trees well too.  To Benjy's point pants encourages single source trees and is arguably most natural to use with them.
That said, I'm sympathetic to the vast ingraining of "project" and maven style layout by the popularity of maven out in the world.  We will likely have a good deal of users and contributors coming from that mindset and world.

The choice here then boils down to favoring pants options or the expectations of folks used to maven.  There are arguments to be made for either, but that's the argument that must be made.  Why is it better to steer folks to the pants worldview?  Why is it better to bend and accommodate maven's worldview instead?

Patrick Lawson

Feb 8, 2015, 12:01:06 AM
to John Sirois, Andy Reitz, Benjy Weinberger, pants-devel
I disagree that this is really about maven vs. pants.  In an internal monorepo, the desire to have a single src and test directory stems largely from the fact that the code within that repo is intended to be seen as an indivisible unit.  On the other hand, the specific point of contrib projects is that they (1) potentially start outside the repo and get integrated later--and we don't want to add friction to that integration and (2) can be clearly separated when editing (and to be clear, I don't think of intellij or whatever when I say this, just directory layout).

I certainly think that within pantsbuild.pants/src we should behave as a monorepo; but within contrib/ it makes sense to call out the fact that these are heterogeneous projects that specifically haven't been integrated into the core.  Integration into the core itself (what we now call backends) is equivalent to this inclusion in the monorepo, and should happen after we've determined the project is mature and fits well as a core backend or feature.  After all, we've _already_ decided to go with another top level source root by choosing to maintain a separate contrib/ dir.

So I guess I'll double down on my original "I like this but it adds source roots".  I'm not particularly worried about a finite number of source roots mapping one to one with a small number of contrib projects--and I don't think maven vs. pants vs. intellij plugin is something that should really factor into this.

John Sirois

Feb 8, 2015, 12:06:26 AM
to Patrick Lawson, Andy Reitz, Benjy Weinberger, pants-devel
On Sat, Feb 7, 2015 at 10:01 PM, Patrick Lawson <p...@foursquare.com> wrote:
I disagree that this is really about maven vs. pants.  In an internal monorepo, the desire to have a single src and test directory stems largely from the fact that the code within that repo is intended to be seen as an indivisible unit.  On the other hand, the specific point of contrib projects is that they (1) potentially start outside the repo and get integrated later--and we don't want to add friction to that integration and (2) can be clearly separated when editing (and to be clear, I don't think of intellij or whatever when I say this, just directory layout).

Here's the thing - if you treat a contrib as a vertical it will be a vertical.  Joe contribs console helper foo with innovation bar that stays in the foo vertical.  Jane contributes baz that could use the bar innovation but she either doesn't discover it because - verticals, or she does and is intimidated or too lazy to factor bar into a shareable target.  This story is not special to contrib; it's true of any repo with maven style (vertical) layout - IMO.

Not saying this tips me either way, just pointing out what I've seen to be a truth of the layout styles over the years.

David Taylor

Feb 8, 2015, 1:13:38 AM
to John Sirois, Patrick Lawson, Andy Reitz, Benjy Weinberger, pants-devel
My vote is for one directory under contrib per plugin, and more freedom for plugins to define their own internal structure.

I think that optimizing for something other than code reuse in contrib is probably okay since generally plugins should be built out of reusable components _from core_, not from other plugins. I think that ease of contribution is a bigger factor for plugins: I'd guess that many plugins will start as site-specific or internal functionality somewhere and then make their way upstream, and being able to drop them in whole in a given place seems easier to me than having to break them up and scatter them around.

—davidt

Eric Zundel Ayers

Feb 8, 2015, 6:58:43 AM
to Benjy Weinberger, John Sirois, Patrick Lawson, pants-devel

I will pile on to just having one 'src' and 'test' directory under contrib. Besides the reasons already mentioned, there is an established convention, as that is how the other 'examples' and 'testprojects' directories are set up.

Speaking of convention, I've never understood why we change ‘pants’ to ‘pants_test’ under test directories, like this:

src/python/pants/…
test/python/pants_test/…

why not just

test/python/pants/...

?



mateor

Feb 8, 2015, 7:03:46 AM
to pants...@googlegroups.com
I was just looking this up the other day. Here is a conversation Larry and Benjy had about it.

https://rbcommons.com/s/twitter/r/199/

Benjy Weinberger

Feb 8, 2015, 12:08:16 PM
to Eric Zundel Ayers, John Sirois, Patrick Lawson, pants-devel
On Sun, Feb 8, 2015 at 3:58 AM, Eric Zundel Ayers <zun...@squareup.com> wrote:

I will pile on to just having one 'src' and 'test' directory under contrib. Besides the reasons already mentioned, there is an established convention, as that is how the other 'examples' and 'testprojects' directories are set up.

Speaking of convention, I've never understood why we change ‘pants’ to ‘pants_test’ under test directories, like this:

src/python/pants/…
test/python/pants_test/…

why not just

test/python/pants/...

Unlike the JVM, Python (before 3.3) doesn't intrinsically support a logical package split across two filesystem locations. If you tried to `from pants import foo`, where the relevant code was in test/python/pants/foo, but src/python was earlier on the PYTHONPATH, Python would descend as far as src/python/pants, see that src/python/pants/foo doesn't exist, and throw an error. It wouldn't continue to search the rest of the path.
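
The failure described above can be reproduced with a small throwaway sketch. The package name `shadow_demo` is a hypothetical stand-in for `pants`, and the directories are created in a temporary location purely for illustration.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
src_pkg = os.path.join(root, "src", "python", "shadow_demo")
test_pkg = os.path.join(root, "test", "python", "shadow_demo")
for pkg in (src_pkg, test_pkg):
    os.makedirs(pkg)
    # A regular package: an empty __init__.py with no namespace handling.
    open(os.path.join(pkg, "__init__.py"), "w").close()

# Only the test tree contains the `foo` module.
with open(os.path.join(test_pkg, "foo.py"), "w") as f:
    f.write("VALUE = 'from test tree'\n")

# src/python comes first on the path, as in the scenario above.
sys.path[:0] = [os.path.dirname(src_pkg), os.path.dirname(test_pkg)]
importlib.invalidate_caches()

try:
    importlib.import_module("shadow_demo.foo")
    found = True
except ImportError:
    # The src copy of the package wins, and it has no `foo` submodule.
    found = False

print("shadow_demo.foo importable:", found)
```

Because both directories are regular packages, the first one found on the path defines the whole package and the second is never consulted.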

There is a way around this, known as "namespace packages", in which you add one of a few magic incantations in all the __init__.py files along the way, to manipulate the module's __path__. However, these incantations are unofficial hacks, and this method is not blessed by the Python powers-that-be. The PEPs proposing them have been rejected.
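
As an illustration, one such incantation is the `pkgutil.extend_path` form, which ships in the standard library. The sketch below is only a demonstration under assumed names: `legacy_ns_demo` is hypothetical, and the two trees live in a temporary directory.

```python
import importlib
import os
import sys
import tempfile

# The pkgutil-style incantation, placed in every __init__.py of the split package.
NAMESPACE_INIT = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
)

root = tempfile.mkdtemp()
src_pkg = os.path.join(root, "src", "python", "legacy_ns_demo")
test_pkg = os.path.join(root, "test", "python", "legacy_ns_demo")
for pkg in (src_pkg, test_pkg):
    os.makedirs(pkg)
    with open(os.path.join(pkg, "__init__.py"), "w") as f:
        f.write(NAMESPACE_INIT)

# Only the test tree contains the `foo` module.
with open(os.path.join(test_pkg, "foo.py"), "w") as f:
    f.write("VALUE = 'from test tree'\n")

sys.path[:0] = [os.path.dirname(src_pkg), os.path.dirname(test_pkg)]
importlib.invalidate_caches()

# The src copy is found first, but its __path__ now also covers the test copy.
foo = importlib.import_module("legacy_ns_demo.foo")
print(foo.VALUE)
```

The cost is that every `__init__.py` along the shared prefix has to carry the same two lines, in every tree that contributes to the package.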

Sometimes this is unavoidable - e.g., when distributing code on pypi, would you want to give every distribution a different top-level package name? But it still adds complexity, so we avoid it where possible.

Python 3.3 implements PEP 420, which is now the officially blessed way of doing namespace packages.
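
A minimal sketch of the PEP 420 behavior, assuming Python 3.3 or later: directories without any `__init__.py` are merged into one implicit namespace package. The name `pep420_ns_demo` and the module contents are hypothetical.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
src_pkg = os.path.join(root, "src", "python", "pep420_ns_demo")
test_pkg = os.path.join(root, "test", "python", "pep420_ns_demo")
for pkg in (src_pkg, test_pkg):
    os.makedirs(pkg)  # deliberately no __init__.py in either tree

with open(os.path.join(src_pkg, "core.py"), "w") as f:
    f.write("VALUE = 'from src tree'\n")
with open(os.path.join(test_pkg, "foo.py"), "w") as f:
    f.write("VALUE = 'from test tree'\n")

sys.path[:0] = [os.path.dirname(src_pkg), os.path.dirname(test_pkg)]
importlib.invalidate_caches()

# Both portions contribute modules to the same logical package.
core = importlib.import_module("pep420_ns_demo.core")
foo = importlib.import_module("pep420_ns_demo.foo")
print(core.VALUE, "|", foo.VALUE)
```

Note that a single `__init__.py` in any one of the portions turns that directory back into a regular package, which then shadows the others as before.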

Patrick Lawson

Feb 8, 2015, 12:44:02 PM
to Benjy Weinberger, Eric Zundel Ayers, John Sirois, pants-devel

I will pile on to just having one 'src' and 'test' directory under contrib. Besides the reasons already mentioned, there is an established convention, as that is how the other 'examples' and 'testprojects' directories are set up.

The difference is that the other examples and testprojects are indeed demonstrations of "hey here's a monorepo, it's how you should use pants."  Modules in contrib will instead be plugins to pants, which are almost certainly going to be distributed differently, and just generally are intended to be more isolated.  Except in rare circumstances (admittedly buildgen is one), I would expect a typical third party plugin to look like a regular python project--with a setup.py, normal package layout, and in particular, not using pants at all.  Dependencies on pants itself would be expressed less granularly as a dependency on a particular set of pypi distributed pants plugins (and of course core libs).
 

Here's the thing - if you treat a contrib as a vertical it will be a vertical.  Joe contribs console helper foo with innovation bar that stays in the foo vertical.  Jane contributes baz that could use the bar innovation but she either doesn't discover it because - verticals, or she does and is intimidated or too lazy to factor bar into a shareable target.  This story is not special to contrib; it's true of any repo with maven style (vertical) layout - IMO.

To be honest, I consider this a feature, not a bug.  The point of contrib is that something doesn't fit nicely or obviously into core pants (yet, at least).  We've been burned before by prematurely trying to force code reuse into our model instead of letting a few (generally trivial) pieces of code float around as copy-paste.  Someone adding a contrib project could explicitly depend on this piece of general code (and add a BUILD dep, which should somehow propagate to a pypi dep), but probably they shouldn't.  They could refactor that code into core--but that should be done with care.

When it's clear that a utility belongs in core, _someone_ has to do it, and the fact that contrib looks like a monorepo doesn't make that process any easier.  In my mind, the point of contrib is to put an explicit barrier around monorepo style dependencies, because we are overtly wary of letting contrib projects either depend "loosely" on each other, or especially to let core pants depend on contrib projects at all.

In the end, I think it's important to remember that contrib was considered a nicer alternative to having a separate repo (or repos), because contrib keeps testing localized in a nice way.  We had already decided that we wanted a group of verticals to act as examples of writing plugins and also as a staging ground for integrating code into core pants gradually and carefully.  I'm wary of blurring this code into the monorepo, when it is explicitly supposed to behave as though it came from an outside party.

John Sirois

Feb 8, 2015, 12:50:12 PM
to Patrick Lawson, pants-devel, Benjy Weinberger, Eric Zundel Ayers

Lots of good arguments, so implementer wins.  The one who steps up to do the work has a stronger weighting!  I'd go for your proposal, Patrick, and burn down the bikeshed before it gets its roof.

Benjy Weinberger

Feb 8, 2015, 8:10:05 PM
to John Sirois, Patrick Lawson, pants-devel, Eric Zundel Ayers
Seconded.