Sharing java-buildpack infrastructure for other buildpacks such as php?

Guillaume Berche

Mar 31, 2014, 6:49:41 AM
to vcap...@cloudfoundry.org
The java-buildpack has great infrastructure that at first glance could be useful for other buildpacks:
- artifacts download & caching & upload [3]
- configuration
- logging & debugging/diagnostics
- integration testing [2]

In addition, this infrastructure has great test coverage and documentation.

I wonder if it would make sense to try to share/reuse this logic for other buildpacks such as the php buildpack [4] contributed by Daniel or the node.js buildpack.

What would the java team recommend as a way to reuse such logic from the java-buildpack while keeping the maintenance cost reasonable? Is there any existing experience of such reuse to draw on?
a- consider the java-buildpack as an external dependency (e.g. using git remote) and import some of its classes
b- refactor the java-buildpack so that generic support can be published independently and reused by other buildpacks
c- fork the java-buildpack and tweak the java logic to fit the php logic
d- other?

As I need to work with php apps, and need to make some modifications to Daniel's buildpack, I might try that in the coming weeks. Daniel, let me know if this is something you would be interested in. I realize this would require migrating existing logic from python/bash to ruby.

Thanks in advance,

Guillaume.

[1] https://github.com/cloudfoundry/java-buildpack
[2] https://github.com/cloudfoundry/java-buildpack-system-test/
[3] https://github.com/cloudfoundry/java-buildpack-dependency-builder
[4] https://github.com/dmikusa-pivotal/cf-php-build-pack
[5] https://groups.google.com/a/cloudfoundry.org/d/msg/vcap-dev/ZVQ6huhilus/Fl94FxqLf6EJ

Guillaume Berche

Mar 31, 2014, 7:23:12 AM
to vcap...@cloudfoundry.org
Oops, Daniel, I had missed that you had completely rewritten the php buildpack and now have support similar to the java-buildpack's for caching, logging ... along with associated tests. This reuse suggestion therefore probably makes less sense at this stage?

Guillaume.

Ben Hale

Mar 31, 2014, 8:07:58 AM
to vcap...@cloudfoundry.org
Guillaume,

There's actually a new Cloud Foundry Buildpacks team that's been spun up in the New York area to answer just these kinds of questions.  At the moment, their main goal is simply to get our forks of the Heroku buildpacks up to date (some of them are lagging by many commits).  Beyond that, there are some interesting discussions to be had.  In the old, old days, I'd always planned on making bits of the Java Buildpack re-usable (there was, for a short time, a Buildpack Utilities Git repository), and as you've pointed out, we've done our best to keep the code re-usable even as it exists in the Java Buildpack repository.  We're hamstrung a bit by the fact that buildpacks do not have to be written in Ruby and therefore the Runtime team is reluctant to add Ruby-specific behaviors to staging (the entire NodeJS buildpack is written in Bash).  I wasn't keen on this decision initially, but with hindsight, I think it was a good one.

A typical scenario would be for us to pull out re-usable code into gems, and use a Gemfile[.lock] combined with a bundle install call (if those files existed) to pull in dependencies.  There are downsides to this, in that downloading those gems for each staging (maybe we could get some nice caching in the application cache?) would slow down the staging process.  An alternative is to use the bundle install --deployment call to install the gems into vendor/bundle.  This has the advantage of keeping the dependencies with the source code (removing the need to download each time), but would bloat the buildpack repos quite a bit.
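
The two options above can be pictured as a small decision helper. This is only a sketch, written in Python because it is one of the languages already used for buildpacks in this thread; the function name `bundler_command` and the `vendored` flag are hypothetical, and only `bundle install` and its `--deployment` option come from the message itself.

```python
import os
import tempfile


def bundler_command(buildpack_dir, vendored=False):
    """Return the Bundler command to run for a buildpack, or None.

    Hypothetical helper: nothing is run unless both Gemfile and
    Gemfile.lock exist, mirroring the "if those files existed" condition.
    """
    required = ("Gemfile", "Gemfile.lock")
    if not all(os.path.isfile(os.path.join(buildpack_dir, f)) for f in required):
        return None
    command = ["bundle", "install"]
    if vendored:
        # --deployment installs the gems under vendor/bundle, next to the source.
        command.append("--deployment")
    return command


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(bundler_command(d))  # no Gemfile yet, so nothing to run
        for name in ("Gemfile", "Gemfile.lock"):
            open(os.path.join(d, name), "w").close()
        print(bundler_command(d, vendored=True))
```

The helper only builds the command; whether the gems are fetched at staging time or shipped in the repository is exactly the trade-off discussed above.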

So, since there hasn't been a whole lot of need to share up to this point (and officially there still isn't), we haven't made any decision at all.  This has allowed us to hone the APIs and implementations of that sharable code (which is nice), and after the other buildpacks are stabilized, it might be worthwhile to re-open the discussion.  After an initial investigation indicated that a Ruby buildpack with a more extensive collection of Rubies might reach 1GB, I suspect that the priority of this might be a bit higher than in the past.

Thanks for bringing up the issue of duplicating this support; it's definitely worth thinking about.  Daniel, any thoughts from you?


-Ben Hale
Cloud Foundry Java Experience

glyn.no...@gmail.com

Mar 31, 2014, 9:52:53 AM
to vcap...@cloudfoundry.org
Sharing infrastructure code between buildpacks is a great idea: improvements to the infrastructure would be easier to roll out to multiple buildpacks, there would be less total code to maintain, and documentation and other externals such as log formats would be more consistent between buildpacks.

Clearly this would require a significant investment. Before anyone embarks on that, we need to be clear what language to converge upon. Based on my experience developing the Java buildpack, I'd say Ruby is preferable to bash. But I'm still nervous about Ruby's lack of properly defined/documented semantics and its tendency to ship incompatible changes in minor version updates. Go is rather better than Ruby in these respects but has the downside that code needs recompiling for each target operating system and machine architecture, which could become onerous (especially for developers of new buildpacks or forks of existing buildpacks) as the number of OSes and architectures supported by Cloud Foundry gradually increases. Java would provide a good compromise between stability and portability, but writing buildpacks in Java would turn off many developers.

It would be good if the choice of language was a conscious decision rather than a historical accident. I'm in danger of beginning a bikeshed discussion here - possibly it's too late. But given the conversion to Go in Diego, we should consider Go as a serious contender for writing buildpacks. If the infrastructure was written suitably in Go, it might even be possible for buildpacks written in other languages to reuse the infrastructure.

Ben Hale

Mar 31, 2014, 9:57:32 AM
to vcap...@cloudfoundry.org, glyn.no...@gmail.com
I actually think that Heroku's API of three simple bash entry points is a great solution to this problem.  Today, there is nothing to stop you from writing a buildpack in Go.  Aligning with that, I don't think there should be any requirement to use shared utility code either; it should simply be available if you want to.
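
For readers unfamiliar with that interface: a Heroku-style buildpack consists of three executables, `bin/detect`, `bin/compile`, and `bin/release`. The sketch below models them as Python functions to show how little the contract demands of the implementation language. The `index.php` check, the cache marker file, and the start command are invented for illustration and are not taken from any real buildpack.

```python
import os
import tempfile


def detect(build_dir):
    """bin/detect: return (exit_code, name); exit code 0 means 'I can build this app'."""
    if os.path.isfile(os.path.join(build_dir, "index.php")):  # illustrative check
        return 0, "example-php"
    return 1, ""


def compile_app(build_dir, cache_dir):
    """bin/compile: prepare the app in build_dir, reusing cache_dir between stagings."""
    os.makedirs(cache_dir, exist_ok=True)
    marker = os.path.join(cache_dir, "runtime.marker")  # stands in for a cached download
    cached = os.path.isfile(marker)
    if not cached:
        with open(marker, "w") as f:
            f.write("downloaded\n")
    return "cache hit" if cached else "cache miss"


def release(build_dir):
    """bin/release: return YAML describing how to start the app."""
    return "---\ndefault_process_types:\n  web: php -S 0.0.0.0:$PORT\n"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as build, tempfile.TemporaryDirectory() as cache:
        open(os.path.join(build, "index.php"), "w").close()
        print(detect(build))
        print(compile_app(build, cache))
        print(release(build))
```

Any shared utility code would sit behind these three entry points, which is why the interface itself does not force a choice of language.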

The key here is, rather than mandating a language or infrastructure APIs (e.g. caching, configuration), to let the most compelling choices grow organically.  If Go really is a better language for writing buildpacks then it will win.  If it turns out that Ruby (or Node, or Python) is better, then those will win.  We're still way too early to pick a winner.


-Ben Hale
Cloud Foundry Java Experience

Glyn Normington

Mar 31, 2014, 10:18:01 AM
to Ben Hale, vcap...@cloudfoundry.org
On 31/03/2014 14:57, Ben Hale wrote:
> I actually think that Heroku's API of three simple bash entry points is a great solution to this problem.  Today, there is nothing to stop you from writing a buildpack in Go.  Aligning with that, I don't think there should be any requirement to use shared utility code either; it should simply be available if you want to.

I'm not sure what you mean by "this problem". I agree that the three Heroku entry points are a good external interface to buildpacks, but they don't solve the problem of sharing infrastructure between buildpacks.

> The key here is, rather than mandating a language or infrastructure APIs (e.g. caching, configuration), to let the most compelling choices grow organically.  If Go really is a better language for writing buildpacks then it will win.  If it turns out that Ruby (or Node, or Python) is better, then those will win.  We're still way too early to pick a winner.

If shared buildpack infrastructure is developed, the chances are it will indeed stem organically from one of the buildpacks which currently has suitable infrastructure and so would be written in the same language. But sometimes it is worth taking stock and breaking with tradition. I'm more concerned that this be a conscious decision based on available understanding than that a particular language be chosen.

Ben Hale

Mar 31, 2014, 10:22:01 AM
to vcap...@cloudfoundry.org, Ben Hale, glyn.no...@gmail.com
> > I actually think that Heroku's API of three simple bash entry points is a great solution to this problem.  Today, there is nothing to stop you from writing a buildpack in Go.  Aligning with that, I don't think there should be any requirement to use shared utility code either; it should simply be available if you want to.
>
> I'm not sure what you mean by "this problem". I agree that the three Heroku entry points are a good external interface to buildpacks, but they don't solve the problem of sharing infrastructure between buildpacks.

The problem of picking which language to write the buildpacks (and implicitly, the shared infrastructure) in.  I believe the answer is that we shouldn’t pick any.  If there is code that can be shared, share it.  If that code is popular, more buildpacks will be built to use it.  The language of that infrastructure will then become the winner.  I don’t think that we can consciously pick a language today; we simply don’t have enough of a view of the future.

Glyn Normington

Mar 31, 2014, 11:39:10 AM
to Ben Hale, vcap...@cloudfoundry.org
I think we should choose a language to *define* the future. ;-)

Daniel Mikusa

Mar 31, 2014, 3:04:04 PM
to vcap...@cloudfoundry.org
Ben,

In general, I think sharing code across build packs is a good idea. Makes life easier for build pack developers and helps to raise the overall quality of build packs for CF. It would be nice if there was an easier way to share code, but as Ben pointed out, short of an API change an author's options are limited.

As far as the PHP build pack goes, when I re-wrote it I tried to pull out functionality that I thought would be useful for other build packs into a common library. It’s on GitHub here.

https://github.com/dmikusa-pivotal/py-cf-buildpack-utils

It has two levels of functionality. A lower level for basic things like caching, downloading, extracting files, etc., and a higher level builder, which encapsulates the lower level functionality in what is hopefully an easier-to-use and easier-to-understand API.
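
The two-level split described above might look roughly like the following. This is not the API of py-cf-buildpack-utils; every name here (`cache_path`, `fetch`, `Builder`) is hypothetical, and the download is replaced by a caller-supplied function so that the sketch runs without network access.

```python
import hashlib
import os
import tempfile


# Lower level: small, independent helpers.
def cache_path(cache_dir, url):
    """Map a URL to a stable file name inside the cache directory."""
    return os.path.join(cache_dir, hashlib.sha256(url.encode("utf-8")).hexdigest())


def fetch(url, cache_dir, download):
    """Return the cached file for url, calling download(url) -> bytes only on a miss."""
    path = cache_path(cache_dir, url)
    if not os.path.isfile(path):
        os.makedirs(cache_dir, exist_ok=True)
        with open(path, "wb") as f:
            f.write(download(url))
    return path


# Higher level: a builder that chains the helpers behind a simpler interface.
class Builder:
    def __init__(self, cache_dir, download):
        self.cache_dir = cache_dir
        self.download = download
        self.installed = []

    def install(self, url):
        self.installed.append(fetch(url, self.cache_dir, self.download))
        return self  # returning self allows chained calls


if __name__ == "__main__":
    calls = []

    def fake_download(url):
        calls.append(url)
        return b"payload"

    with tempfile.TemporaryDirectory() as cache:
        builder = Builder(cache, fake_download)
        builder.install("http://example.com/a").install("http://example.com/a")
        print(len(calls))  # the second install is served from the cache
```

A buildpack that needs fine control can call the lower-level helpers directly, while a simple one only touches the builder.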

With the PHP build pack, I opted to include the dependent library with the build pack. I did this because there is only one dependency and it has a small footprint. So far, this has worked out well for me.


Guillaume,

Hope that answers your questions. If you have any feedback on the PHP build pack, just open a GitHub issue or post it here.

Dan

Dr Nic Williams

Mar 31, 2014, 3:27:16 PM
to vcap...@cloudfoundry.org
Daniel, that's great of you to build out some sharable tools/framework for buildpacks; especially the cachable assets part, which will be important for any CF-orientated buildpacks that want to be installable into CFs that have no internet access.
--
Dr Nic Williams
Stark & Wayne LLC - consultancy for Cloud Foundry users
twitter @drnic

Guillaume Berche

Apr 1, 2014, 9:19:37 AM
to vcap-dev
Thanks Ben, Daniel, and Glyn for sharing your views on this and for your efforts to provide reusable code for other buildpacks.

It's a bit harder for cf users and potential contributors to keep up with a wide variety of languages in the cf ecosystem (go, ruby, python, bash), each with its own set of tools (dependency mgt, test/mock frameworks...), but I guess that's also what makes a rich, open and diverse community.

I'm wondering whether some generic features that were recently brought by buildpacks could be moved into the cf bosh release, or into common development tools, so that they benefit all newly created buildpacks, e.g.

- could the buildpack cache feature be a standalone HTTP caching proxy (along with the debugging traces showing the buildpack cache content when debugging is enabled)?
- could some of the diagnostics developments in the java buildpack be moved to the dea (e.g. displaying the last git commits for a custom git repo when debugging is enabled)?
- could enabling buildpack debugging traces be standardized across buildpacks, with support potentially added to the cf cli and dea?
- could the integration testing contributed by the java buildpack [2] be used as a generic test suite for buildpacks, verifying that the buildpack properly starts up an application and possibly properly injects credentials so that apps can use them?
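
As one way to picture the standardized debugging switch suggested in the list above, the sketch below reads a single environment variable that any buildpack, whatever its language, could honor. The variable name `BUILDPACK_LOG_LEVEL` and the function `configure_logging` are assumptions made for this illustration, not an existing Cloud Foundry convention.

```python
import logging
import os

# Level names a buildpack would accept; anything else falls back to INFO.
LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARN": logging.WARNING,
    "ERROR": logging.ERROR,
}


def configure_logging(environ=os.environ):
    """Return a logger whose level comes from BUILDPACK_LOG_LEVEL (hypothetical name)."""
    name = environ.get("BUILDPACK_LOG_LEVEL", "INFO").upper()
    logger = logging.getLogger("buildpack")
    logger.setLevel(LEVELS.get(name, logging.INFO))
    return logger


if __name__ == "__main__":
    log = configure_logging({"BUILDPACK_LOG_LEVEL": "debug"})
    print(log.isEnabledFor(logging.DEBUG))
```

If the cf cli or the dea set such a variable during staging, each buildpack would only need a few lines like these to respect it.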


Just random thoughts...

Guillaume.


Daniel Grippi

Apr 1, 2014, 11:59:22 AM
to vcap...@cloudfoundry.org
All,

Great thread.  I've added some suggestions inline.  Let me know your thoughts.

Daniel


On Tue, Apr 1, 2014 at 9:19 AM, Guillaume Berche <ber...@gmail.com> wrote:
> Thanks Ben, Daniel, and Glyn for sharing your views on this and for your efforts to provide reusable code for other buildpacks.
>
> It's a bit harder for cf users and potential contributors to keep up with a wide variety of languages in the cf ecosystem (go, ruby, python, bash), each with its own set of tools (dependency mgt, test/mock frameworks...), but I guess that's also what makes a rich, open and diverse community.

Language diversity absolutely causes a barrier to entry.  If we were the only ones making buildpacks, I'd be inclined to decide on one language and stick with it; however, and as you've stated, that's not our call.

Heroku has been doing some stellar work on maintaining their buildpacks (the Ruby one gets lots of love), and I'd like to keep in line with all the existing work they've done.  No sense in diverging efforts when we could be working together.  That said, their major buildpacks are written in Bash (minus their Ruby one written in Ruby).


> I'm wondering whether some generic features that were recently brought by buildpacks could be moved into the cf bosh release, or into common development tools, so that they benefit all newly created buildpacks, e.g.
>
> - could the buildpack cache feature be a standalone HTTP caching proxy (along with the debugging traces showing the buildpack cache content when debugging is enabled)?

We were thinking along the same lines -- using a Squid Proxy in lieu of directly manipulating buildpacks to achieve this; however, to the best of my knowledge we're making a push to bundle dependencies within buildpacks themselves.

I know the Java buildpack has an option to point to a source directory when compiling in "Expert" mode.  If we could upload source binaries into some on-prem (or off-prem) file store, decoupling binary dependencies from the zipped buildpacks themselves, that would be *fantastic*.  James / Mark -- Could you weigh in on this as being a feasible option?
 
> - could some of the diagnostics developments in the java buildpack be moved to the dea (e.g. displaying the last git commits for a custom git repo when debugging is enabled)?

+1
 
> - could enabling buildpack debugging traces be standardized across buildpacks, with support potentially added to the cf cli and dea?
> - could the integration testing contributed by the java buildpack [2] be used as a generic test suite for buildpacks, verifying that the buildpack properly starts up an application and possibly properly injects credentials so that apps can use them?

The integration testing suite put together by the Java guys is awesome.  I do want to throw out an alternative in an effort to standardize on what Heroku already has in place for multiple buildpacks:

Consider making Heroku's testing tool Anvil more generic and standardizing on that.  Heroku uses this framework to test buildpacks independently of the buildpack's language (third-party buildpacks are not written only in Bash; e.g. the Lisp buildpack is written in Lisp).

James Bayer

Apr 1, 2014, 12:15:48 PM
to vcap...@cloudfoundry.org
this is a great thread.

improving the buildpack ecosystem in the macro and making it easier to package buildpacks with disconnected/offline CF instances or even just significantly improving the performance of buildpacks for internet connected CF instances is a great outcome. for buildpack authors and CF operators, making it simpler to develop, maintain, test, distribute, and update buildpacks is a goal we should reach for.

i need to understand the various proposals in more detail to offer much more. i'm away on vacation until friday. mark may have some time to look into this, i'm not sure.

personally, i think it might be time to move some of this discussion to a design document where collaborative commenting/editing can take place with open comments rather than trying to follow it all in an email thread. the design docs are here [1] and there is the older, original "CF is migrating to buildpacks" doc [2] that could either be updated or we could start a new doc if others are interested in working on it in a doc kind of way. i'm looking forward to working through these issues and making improvements.

Thank you,

James Bayer