Ruby environment variable handling in Puppet Server

Jeremy Barlow

Apr 13, 2015, 5:06:37 PM

to puppe...@googlegroups.com

Hello all,

The Puppet Server team is hoping to start working soon on https://tickets.puppetlabs.com/browse/SERVER-297 but would first like to solicit feedback on our intended approach. SERVER-297 concerns how Puppet Server handles environment variables with respect to the Ruby code it executes.

For some background…

During the initial development of Puppet Server, a conscious decision was made to shield any Ruby code being executed within Puppet Server’s Java process from directly inheriting any shell environment variables. The primary factor motivating this decision was a fear about the potential contamination of the Puppet Server JRuby runtime by shell environment variables configured for execution under MRI Ruby.

For example, gems with native C extensions require different implementations for the MRI vs. JRuby execution environments. A system's GEM_HOME environment variable, if defined, would most commonly contain a path under which MRI-compatible gems reside, so we thought it would be better for the JRuby-based Puppet Server to maintain its own setting for this path, completely independent of the shell environment. This is captured under a setting called “gem-home” in the “jruby-puppet” section of the “puppetserver.conf” file. Effectively, the value for this “gem-home” setting is injected into the configuration of the JRuby container in Puppet Server such that Ruby code sees it as the value of the GEM_HOME “environment variable”.

In current Puppet Server code, GEM_HOME is the only “environment variable” that any Ruby code would see. All other environment variables which may have been active in the shell under which the Puppet Server process was launched - PATH, USER, etc. - are removed / not made visible to any Ruby code running within the Puppet Server process.

We realized later, however, that by not providing a mechanism for Puppet Server to configure other environment variables into a JRuby container, we had made some Ruby code impossible to configure under Puppet Server. For example, assume that your Puppet code needed to make use of a gem that is designed to read its configuration only from an environment variable called FOO. Under current Puppet Server code, it would not be possible for the gem to get a value for this variable.

To address this use case while preserving the ability to avoid contaminating the JRuby environment with content intended only for use with MRI Ruby, we’ve discussed the possibility of adding an “environment variable” map to the “jruby-puppet” section of the “puppetserver.conf” file.

For example, assume that an init script had defined a shell environment variable as:

export FOO=for_mri_ruby

… whereas the “jruby-puppet” section of Puppet Server’s “puppetserver.conf” file had:

  jruby-puppet {
    gem-home: /var/lib/puppet/jruby-gems 
    environment-variables: { 
      FOO: for_jruby 
    }
    ...
  }

When the gem mentioned earlier accesses ENV['FOO'] while running under Puppet Server, it would get a value of “for_jruby”. This would happen because the environment variables for the JRuby container would be drawn from the configuration file as opposed to the actual shell environment.
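To make the intended behavior concrete, here is a minimal Ruby sketch of how a container's view of the environment could be assembled from the configuration alone. The method name `container_env` and the hash layout are hypothetical and are not Puppet Server's actual implementation; letting “gem-home” take precedence over a GEM_HOME entry in the map is likewise an assumption.

```ruby
# Hypothetical sketch: builds the "environment" a JRuby container would expose
# to Ruby code, using only values from the jruby-puppet config section.
# The shell environment is deliberately never consulted.
def container_env(jruby_puppet_config)
  env = {}

  # Entries from the proposed environment-variables map.
  jruby_puppet_config.fetch('environment-variables', {}).each do |name, value|
    env[name] = value.to_s
  end

  # Assumption: the dedicated gem-home setting wins over any GEM_HOME entry.
  gem_home = jruby_puppet_config['gem-home']
  env['GEM_HOME'] = gem_home if gem_home

  env
end

config = {
  'gem-home' => '/var/lib/puppet/jruby-gems',
  'environment-variables' => { 'FOO' => 'for_jruby' }
}

# The init script's "export FOO=for_mri_ruby" plays no part here.
env = container_env(config)
puts env['FOO']
puts env['GEM_HOME']
```

Because the shell value is never an input to the method, the gem would see only what the configuration file declares.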

Encapsulating the JRuby “environment variable” definition in a configuration file as opposed to the actual shell environment could also set a nice precedent for future use cases, purely speculative today, where individual JRuby containers might need to have completely unique characteristics from one another. For example, one JRuby container could maintain a set of “environment variables” appropriate for “Ruby 1.9.3” execution, while another container, potentially running in the same Java process, could use a different set of variables appropriate for “Ruby 2.2”. Note that we don’t have any formal plans in place to develop toward this specific use case at this time.

While we are leaning toward a config-file driven approach, we would be interested in hearing of any specific use cases you may know of where this may be insufficient. We would specifically be interested in any use cases which suggest that some affordance in the design should be made to allow for some (or all?) variables seen by Ruby code to be drawn from the actual shell environment, as opposed to just a configuration file. As mentioned earlier, we’re a bit leery of opening up the possibility of having Ruby code running under JRuby inherit shell environment variables because of potential MRI / JRuby contamination issues. However, we want to ensure that we’re not being short-sighted by letting this concern preclude other valuable use cases.

Thanks!

— Jeremy

Ken Barber

Apr 14, 2015, 7:35:29 AM

to puppe...@googlegroups.com

> While we are leaning toward a config-file driven approach, we would be
> interested in hearing of any specific use cases you may know of where this
> may be insufficient. We would specifically be interested in any use cases
> which suggest that some affordance in the design should be made to allow for
> some (or all?) variables seen by Ruby code to be drawn from the actual shell
> environment, as opposed to just a configuration file.

Might be clutching at straws here, but there might be a case for
something like http_proxy (which is a reasonably common convention) in
a closed environment that requires it, to be just passed through,
versus defining it also in another configuration file again. That kind
of environment var is _sometimes_ set globally to avoid configuring
the proxy config in all the different clients/services that a *nix box
has. I think Net::HTTP honors this environment variable for example,
so this might apply to some functions that make outbound http calls.
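As a small illustration of that convention: in Ruby 2.0 and later, the standard library's `URI#find_proxy` reads `http_proxy` and `no_proxy` from the process environment, and `Net::HTTP` uses it by default. The proxy host and target address below are made-up values, and whether this applies inside Puppet Server depends on the Ruby compatibility level of the JRuby container and on the variable being visible there at all.

```ruby
require 'uri'

# Illustrative values only; the proxy host and target address are made up.
ENV['http_proxy'] = 'http://proxy.example.com:3128'
ENV['no_proxy']   = 'localhost'

# URI#find_proxy consults http_proxy / no_proxy from the process environment.
# An IP literal is used as the target so that no DNS lookup is needed.
proxy = URI('http://192.0.2.10/').find_proxy

puts proxy ? "proxy: #{proxy.host}:#{proxy.port}" : 'no proxy configured'
```

If the variable never reaches `ENV`, as is the case for Ruby code under current Puppet Server, `find_proxy` returns nil and the request would go out directly.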

Of course, I'd rather hear what the community has to say about this.
Maybe users would prefer to manage this more precisely instead of
globally anyway from a puppetserver/function perspective.

ken.

Nan Liu

Apr 14, 2015, 12:09:40 PM

to puppe...@googlegroups.com

On Tuesday, April 14, 2015 at 4:35:29 AM UTC-7, Ken Barber wrote:

>> While we are leaning toward a config-file driven approach, we would be
>> interested in hearing of any specific use cases you may know of where this
>> may be insufficient. We would specifically be interested in any use cases
>> which suggest that some affordance in the design should be made to allow for
>> some (or all?) variables seen by Ruby code to be drawn from the actual shell
>> environment, as opposed to just a configuration file.

> Might be clutching at straws here, but there might be a case for
> something like http_proxy (which is a reasonably common convention) in
> a closed environment that requires it, to be just passed through,
> versus defining it also in another configuration file again. That kind
> of environment var is _sometimes_ set globally to avoid configuring
> the proxy config in all the different clients/services that a *nix box
> has. I think Net::HTTP honors this environment variable for example,
> so this might apply to some functions that make outbound http calls.

+1, http_proxy and no_proxy not being honored in puppet functions is one of the annoyances I've run into with puppet-server.

> Of course, I'd rather hear what the community has to say about this.
> Maybe users would prefer to manage this more precisely instead of
> globally anyway from a puppetserver/function perspective.

I'm fine explicitly setting environment variables for puppetserver if there's an option to pass through:

  environment-variables: {
    FOO: $FOO
    BAR: $BAR:-val
  }
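A rough Ruby sketch of how such pass-through values might be resolved is below. The `$NAME` and `$NAME:-default` notation is only the suggestion above, not an existing Puppet Server feature, and the method names `resolve_value` and `resolve_map` are hypothetical. The default is applied when the shell variable is unset or empty, mirroring the shell's own `:-` behavior.

```ruby
# Hypothetical resolver for the suggested pass-through notation.
#   "$NAME"          -> value of NAME in the shell environment (nil if unset)
#   "$NAME:-default" -> value of NAME, or "default" if unset or empty
#   anything else    -> taken as a literal value
def resolve_value(raw, shell_env)
  if (m = /\A\$(\w+):-(.*)\z/.match(raw))
    current = shell_env[m[1]]
    (current.nil? || current.empty?) ? m[2] : current
  elsif (m = /\A\$(\w+)\z/.match(raw))
    shell_env[m[1]]
  else
    raw
  end
end

# Resolves every entry of the configured map against the shell environment.
def resolve_map(configured, shell_env)
  resolved = {}
  configured.each do |name, raw|
    value = resolve_value(raw, shell_env)
    resolved[name] = value unless value.nil? # unset pass-throughs are omitted
  end
  resolved
end

shell_env  = { 'FOO' => 'from_shell' }
configured = { 'FOO' => '$FOO', 'BAR' => '$BAR:-val', 'BAZ' => 'literal' }

resolved = resolve_map(configured, shell_env)
resolved.each { |name, value| puts "#{name}=#{value}" }
```

Only variables named in the map are ever read from the shell, so nothing leaks into the container unless it has been listed explicitly.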

Thanks,

Nan

Jeremy Barlow

Apr 16, 2015, 11:58:04 AM

to puppe...@googlegroups.com

Thanks for the responses on this thread so far and some of the corresponding discussion that has been spawned in the related JIRA ticket - https://tickets.puppetlabs.com/browse/SERVER-297.

Most of the discussion I've seen about this so far has centered on the lack of an ability for Puppet Server to use a proxy for HTTP/S communications when needed. This has been mostly with respect to the "puppetserver gem" command being unable to access gem repositories via a proxy - also covered in https://tickets.puppetlabs.com/browse/SERVER-377. Proxy support is an issue that clearly needs to be addressed, both for puppetserver CLI tools and the production Puppet Server stack. See https://tickets.puppetlabs.com/browse/SERVER-156 around the production Puppet Server stack's current lack of support for using a proxy.

We will certainly need to get to a solution that allows for values for the *_PROXY variables to be made available to Ruby code running in a JRuby container in Puppet Server. I think there's also a very reasonable case to be made that these specific variables be drawn from the actual shell environment, when not overridden by Puppet Server configuration, given that they are very commonly used to perform system-wide proxy configuration that many tools honor as defaults.
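One way to picture that precedence is the following Ruby sketch. The list of variable names and the method name `jruby_env` are assumptions made for illustration; nothing here reflects a decided design.

```ruby
# Hypothetical: proxy-related names allowed to flow through from the shell.
PROXY_VARS = %w[http_proxy https_proxy no_proxy
                HTTP_PROXY HTTPS_PROXY NO_PROXY].freeze

# Shell values for the proxy variables act as defaults; anything set in the
# Puppet Server configuration overrides them. No other shell variable is kept.
def jruby_env(shell_env, configured)
  inherited = shell_env.select { |name, _value| PROXY_VARS.include?(name) }
  inherited.merge(configured)
end

shell_env = {
  'http_proxy' => 'http://proxy.example.com:3128', # made-up proxy address
  'GEM_HOME'   => '/usr/lib/ruby/gems',            # MRI-oriented, must not leak
  'FOO'        => 'for_mri_ruby'
}
configured = { 'FOO' => 'for_jruby' }

jruby_env(shell_env, configured).each { |name, value| puts "#{name}=#{value}" }
```

Here `http_proxy` reaches the container from the shell, `FOO` comes from the configuration, and the shell's `GEM_HOME` is dropped.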

What is not clear to me at this point, though, is whether the *_PROXY variable case alone is enough to inform the more general approach that we take toward environment variable population into the JRuby containers - and specifically whether or not there is a requirement to provide a more general-purpose mechanism for choosing arbitrary variables to flow through from the shell environment to the JRuby container.

I'd like to hear if anyone has other use cases - ones not related to *_PROXY - for flowing environment variables from the shell to the JRuby container. Specifically, I'd like to hear of other use cases that a config-file driven approach like the one described in the initial post on this thread would not satisfy.

Thanks again!

--- Jeremy

Clayton O'Neill

Apr 16, 2015, 12:37:28 PM

to puppe...@googlegroups.com

I'm not sure that we have a use case for the server, but as an aside, we do have a use case on the agent side, if that is ever a concern:

We use the ability to set environment variables when doing OpenStack upgrades. OpenStack has the concept of public (publicURL) and private (internalURL) endpoints. Normally, our Puppet runs use the public endpoints for all API calls and provisioning. However, when we're doing upgrades, we sometimes have the external load balancer disabled so that clients cannot make changes to a database we may revert. In this case, we set the OS_ENDPOINT_TYPE environment variable to 'internalURL' in our upgrade scripts.

I don't know if this will ever be an issue for the agent, but in this situation a config-based option would be significantly less useful.
