Alternative to facts-in-query

Luke Kanies

Nov 25, 2009, 2:50:58 PM
to puppe...@googlegroups.com
Looking at the problems resulting from us putting facts in the get
request, e.g. #2855, makes me again think this is the wrong approach.

This ability only exists because, when running multiple servers
without client binding, there's a chance that the facts get sent to a
different server than the catalog is retrieved from.

I'm thinking that it might be a better idea to solve this problem than
to hack around it. The main solution I'm thinking of is essentially
requiring some kind of shared back-end or requiring a shared cache
such as memcached.

A shared cache with memcached should be pretty close to trivial - just
another terminus type. This obviously adds another dependency, but
only in those cases where you 1) have multiple masters, 2) don't have
client binding to an individual master, and 3) aren't using some
common back-end (one of which will be available from us with this
information by the next major release).

Is this a reasonable approach? It's obviously not sufficient for
0.25.2, but I think it's a better long term direction.

--
I went to a restaurant that serves "breakfast at anytime". So I
ordered French Toast during the Renaissance. -- Stephen Wright
---------------------------------------------------------------------
Luke Kanies -|- http://reductivelabs.com -|- +1(615)594-8199

Nigel Kersten

Nov 25, 2009, 3:15:48 PM
to puppe...@googlegroups.com
On Wed, Nov 25, 2009 at 11:50 AM, Luke Kanies <lu...@reductivelabs.com> wrote:
> Looking at the problems resulting from us putting facts in the get
> request, e.g. #2855, makes me again think this is the wrong approach.
>
> This ability only exists because, when running multiple servers
> without client binding, there's a chance that the facts get sent to a
> different server than the catalog is retrieved from.

It's not just the catalog retrieval though is it? It was also a
problem even if you set up a connection to Server A, received the
catalog from Server A, but then a subsequent file request is answered
by Server B, which may have no idea what environment your client is in,
given that the environment isn't encoded in the puppet:/// source URI.


> I'm thinking that it might be a better idea to solve this problem than
> to hack around it.  The main solution I'm thinking of is essentially
> requiring some kind of shared back-end or requiring a shared cache
> such as memcached.
>
> A shared cache with memcached should be pretty close to trivial - just
> another terminus type.  This obviously adds another dependency, but
> only in those cases where you 1) have multiple masters, 2) don't have
> client binding to an individual master, and 3) aren't using some
> common back-end (one of which will be available from us with this
> information by the next major release).
>
> Is this a reasonable approach?  It's obviously not sufficient for
> 0.25.2, but I think it's a better long term direction.

So for me this all depends on how well it scales... happy to do tests.

Is it perhaps feasible to have the server tell the client to resubmit
the fact values to it if it doesn't have a fresh cache? I have this
nagging feeling I already brought up this in the past and it wasn't
feasible.




--
nigel

Luke Kanies

Nov 25, 2009, 3:21:01 PM
to puppe...@googlegroups.com
On Nov 25, 2009, at 12:15 PM, Nigel Kersten wrote:

> On Wed, Nov 25, 2009 at 11:50 AM, Luke Kanies
> <lu...@reductivelabs.com> wrote:
>> Looking at the problems resulting from us putting facts in the get
>> request, e.g. #2855, makes me again think this is the wrong approach.
>>
>> This ability only exists because, when running multiple servers
>> without client binding, there's a chance that the facts get sent to a
>> different server than the catalog is retrieved from.
>
> It's not just the catalog retrieval though is it? It was also a
> problem even if you set up a connection to Server A, received the
> catalog from Server A, but then a subsequent file request is answered
> by Server B, which may have no idea what environment your client is,
> given that the environment isn't encoded in the puppet:/// source URI.

Well, the environment doesn't matter for the files, because it's
included in the URI itself.

>> I'm thinking that it might be a better idea to solve this problem
>> than
>> to hack around it. The main solution I'm thinking of is essentially
>> requiring some kind of shared back-end or requiring a shared cache
>> such as memcached.
>>
>> A shared cache with memcached should be pretty close to trivial -
>> just
>> another terminus type. This obviously adds another dependency, but
>> only in those cases where you 1) have multiple masters, 2) don't have
>> client binding to an individual master, and 3) aren't using some
>> common back-end (one of which will be available from us with this
>> information by the next major release).
>>
>> Is this a reasonable approach? It's obviously not sufficient for
>> 0.25.2, but I think it's a better long term direction.
>
> So for me this all depends on how well it scales... happy to do tests.
>
> Is it perhaps feasible to have the server tell the client to resubmit
> the fact values to it if it doesn't have a fresh cache? I have this
> nagging feeling I already brought up this in the past and it wasn't
> feasible.


You have the same problem, don't you? If we push new data, it could
again end up at the wrong host.

--
A person's maturity consists in having found again the seriousness one
had as a child, at play. --Friedrich Nietzsche
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com

Markus Roberts

Nov 25, 2009, 3:38:28 PM
to puppet-dev
One possibility we're overlooking here (I'm not making any claims apart from the fact that it's a distinct solution) is to bind a run to a server on the initial exchange (e.g. do a redirect from the generic "puppetmaster pool" URL to an equivalent but more specific "the particular puppetmaster who's handling you for this run" URL).  Session based web services sometimes use this technique.


Nigel Kersten

Nov 25, 2009, 3:53:43 PM
to puppe...@googlegroups.com
On Wed, Nov 25, 2009 at 12:21 PM, Luke Kanies <lu...@madstop.com> wrote:

> Well, the environment doesn't matter for the files, because it's
> included in the URI itself.

But not in localconfig.yaml right? So if you fail on your initial
connection, and continue to apply a cached catalog, don't you fall
back to the same problem?

>>> Is this a reasonable approach?  It's obviously not sufficient for
>>> 0.25.2, but I think it's a better long term direction.
>>
>> So for me this all depends on how well it scales... happy to do tests.
>>
>> Is it perhaps feasible to have the server tell the client to resubmit
>> the fact values to it if it doesn't have a fresh cache? I have this
>> nagging feeling I already brought up this in the past and it wasn't
>> feasible.
>
>
> You have the same problem, don't you?  If we push new data, it could
> again end up at the wrong host.

I assumed if a specific server requested the info, it would be
delivered to that server.

Luke Kanies

Nov 25, 2009, 4:15:30 PM
to puppe...@googlegroups.com
On Nov 25, 2009, at 12:53 PM, Nigel Kersten wrote:

> On Wed, Nov 25, 2009 at 12:21 PM, Luke Kanies <lu...@madstop.com>
> wrote:
>
>> Well, the environment doesn't matter for the files, because it's
>> included in the URI itself.
>
> But not in localconfig.yaml right? So if you fail on your initial
> connection, and continue to apply a cached catalog, don't you fall
> back to the same problem?

I'm not actually sure. My point was that the server doesn't need to
know the client's env for the client to do a call, as long as the
client knows it. The problem with the facts, and I believe this is
unique to the fact/catalog coupling, is that the compiling server
needs access to the client's facts, and we'd prefer to do that over
two queries rather than one.

>>>> Is this a reasonable approach? It's obviously not sufficient for
>>>> 0.25.2, but I think it's a better long term direction.
>>>
>>> So for me this all depends on how well it scales... happy to do
>>> tests.
>>>
>>> Is it perhaps feasible to have the server tell the client to
>>> resubmit
>>> the fact values to it if it doesn't have a fresh cache? I have this
>>> nagging feeling I already brought up this in the past and it wasn't
>>> feasible.
>>
>>
>> You have the same problem, don't you? If we push new data, it could
>> again end up at the wrong host.
>
> I assumed if a specific server requested the info, it would be
> delivered to that server.


That only works, I believe, for people who use DNS round robin or its
equivalents, rather than people who use a load balancer, right?

--
Normal is getting dressed in clothes that you buy for work and driving
through traffic in a car that you are still paying for - in order to
get to the job you need to pay for the clothes and the car, and the
house you leave vacant all day so you can afford to live in it.
-- Ellen DeGeneres

Luke Kanies

Nov 25, 2009, 4:16:48 PM
to puppe...@googlegroups.com
I'm amenable but I've no idea how common/supportable this is. Is this
often how load balancers work? I'd expect that if someone wants to
throw up an F5 in front of their masters that the F5 would be the only
route through to the masters, and I'd (somewhat naïvely) expect there
not to be another route to the masters.

--
Brand's Asymmetry:
The past can only be known, not changed. The future can only be
changed, not known.

Nigel Kersten

Nov 25, 2009, 4:18:53 PM
to puppe...@googlegroups.com
On Wed, Nov 25, 2009 at 1:16 PM, Luke Kanies <lu...@madstop.com> wrote:
> On Nov 25, 2009, at 12:38 PM, Markus Roberts wrote:
>
>> One possibility we're overlooking here (I'm not making any claims
>> apart from the fact that it's a distinct solution) is to bind a run
>> to a server on the initial exchange (e.g. do a redirect from the
>> generic "puppetmaster pool" URL to an equivalent but more specific
>> "the particular puppetmaster who's handling you for this run" URL).
>> Session based web services sometimes use this technique.
>
>
> I'm amenable but I've no idea how common/supportable this is.  Is this
> often how load balancers work?  I'd expect that if someone wants to
> throw up an F5 in front of their masters that the F5 would be the only
> route through to the masters, and I'd (somewhat naïvely) expect there
> not to be another route to the masters.

That's certainly been the case in some load balancing environments
I've worked with, but others (and this is how we're planning to deal
with Puppet) will load balance on one port but still allow another
route directly to individual backends on another port.



--
nigel

Luke Kanies

Nov 25, 2009, 6:00:00 PM
to puppe...@googlegroups.com
How, on the server, do we differentiate between the two addresses or
whatever? How would one normally tell the server which name or IP it
should provide clients?

--
The remarkable thing about Shakespeare is that he really is very good,
in spite of all the people who say he is very good. -- Robert Graves

David Lutterkort

Nov 25, 2009, 6:39:36 PM
to puppe...@googlegroups.com
On Wed, 2009-11-25 at 13:16 -0800, Luke Kanies wrote:
> On Nov 25, 2009, at 12:38 PM, Markus Roberts wrote:
>
> > One possibility we're overlooking here (I'm not making any claims
> > apart from the fact that it's a distinct solution) is to bind a run
> > to a server on the initial exchange (e.g. do a redirect from the
> > generic "puppetmaster pool" URL to an equivalent but more specific
> > "the particular puppetmaster who's handling you for this run" URL).
> > Session based web services sometimes use this technique.
>
>
> I'm amenable but I've no idea how common/supportable this is. Is this
> often how load balancers work? I'd expect that if someone wants to
> throw up an F5 in front of their masters that the F5 would be the only
> route through to the masters, and I'd (somewhat naïvely) expect there
> not to be another route to the masters.

Webapps usually get around that with 'sticky' loadbalancing -
essentially, the loadbalancer can be told to look for a cookie and/or
request parameter in the request and then makes sure that requests with
the same cookie value always get routed to the same server.

In the Java world, that's what the infamous jsessionid request parameter
and cookie are for.
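
A minimal sketch of the sticky routing described above, in Python. Everything here is hypothetical (the class, the cookie format, the master names); it only illustrates a balancer pinning requests that carry the same cookie to the same backend, and says nothing about what the Puppet client actually sends.

```python
import itertools


class StickyBalancer:
    """Toy load balancer: route by session cookie, else round robin."""

    def __init__(self, backends):
        self.backends = list(backends)
        self._next = itertools.cycle(self.backends)
        self.sessions = {}  # cookie value -> backend name

    def route(self, cookie=None):
        """Return (backend, cookie) for one request."""
        if cookie in self.sessions:
            # Known session: always the same backend.
            return self.sessions[cookie], cookie
        # New session: pick the next backend and remember the binding.
        backend = next(self._next)
        cookie = cookie or f"sess-{len(self.sessions) + 1}"
        self.sessions[cookie] = backend
        return backend, cookie


lb = StickyBalancer(["master-a", "master-b"])
first_backend, cookie = lb.route()        # facts upload, no cookie yet
second_backend, _ = lb.route(cookie)      # catalog request, same cookie
print(first_backend, second_backend)
```

The binding lives in the balancer's session table, so it only helps if the client replays the cookie on every request in a run.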

David


Brice Figureau

Nov 26, 2009, 3:47:38 AM
to puppe...@googlegroups.com
On Wed, 2009-11-25 at 11:50 -0800, Luke Kanies wrote:
> Looking at the problems resulting from us putting facts in the get
> request, e.g. #2855, makes me again think this is the wrong approach.

It was kludgy at best, and I even remember warning you about that prior
to the 0.25 release.

> This ability only exists because, when running multiple servers
> without client binding, there's a chance that the facts get sent to a
> different server than the catalog is retrieved from.
>
> I'm thinking that it might be a better idea to solve this problem than
> to hack around it. The main solution I'm thinking of is essentially
> requiring some kind of shared back-end or requiring a shared cache
> such as memcached.

My concern here is that we don't want to cache the facts, we want to
have them at the disposal of several masters. The issue with caching is that
you're never sure the content will be there (that's even the whole point
of a cache).
So what happens if memcached decides to purge the facts for a given host
and said host asks for a catalog?

What we need is a (more) persistent shared storage for this. And the
only one we have at the moment is storeconfigs/thin_storeconfigs.
Granted those are performance suckers (less of course for
thin_storeconfigs), so that might not be useful for large sites (which
of course need several masters).
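
To make the cache-versus-storage concern concrete, here is a small Python sketch. It is not memcached and not Puppet code: LRUCache, compile_catalog and the host names are made up, and the cache is limited to two entries only to force an eviction.

```python
from collections import OrderedDict


class LRUCache:
    """Bounded cache: storing a new key may silently evict the oldest one."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def set(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)
        return self.data[key]


def compile_catalog(store, host):
    """Stand-in for compilation: fails if the host's facts are missing."""
    facts = store.get(host)
    if facts is None:
        raise LookupError(f"no facts for {host}")
    return {"host": host, "os": facts["os"]}


cache = LRUCache(capacity=2)
persistent = {}  # a plain dict standing in for a persistent shared store
for host in ("web1", "web2", "web3"):
    cache.set(host, {"os": "linux"})
    persistent[host] = {"os": "linux"}

try:
    compile_catalog(cache, "web1")
    cache_result = "compiled"
except LookupError as err:
    cache_result = str(err)

print(cache_result)
print(compile_catalog(persistent, "web1"))
```

With a true cache the facts for web1 are gone by the time the catalog is requested, and unlike ordinary cached data the master has no way to regenerate them on its own.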

> A shared cache with memcached should be pretty close to trivial - just
> another terminus type. This obviously adds another dependency, but
> only in those cases where you 1) have multiple masters, 2) don't have
> client binding to an individual master, and 3) aren't using some
> common back-end (one of which will be available from us with this
> information by the next major release).
>
> Is this a reasonable approach? It's obviously not sufficient for
> 0.25.2, but I think it's a better long term direction.

I see several possibilities:
* we bend the REST model to POST the facts and get the catalog as a
result (ie one transaction like now, but posted)

* we HTTP pipeline the facts posting and the catalog get in the same
stream (not sure the LB wouldn't split the request in pieces and direct
those to different upstream masters).

* we ask users to use a LB with a client IP hash load balancing scheme
(ie clients are always sent to the same master).

* we implement a master-to-master protocol (or a ring à la spread or
using a message queue/topic). If a client asks master A for a catalog,
A contacts the other masters to find the one with the freshest facts,
compiles and sends back the catalog.

* we don't care and ask users wanting to have multiple masters to use a
shared filesystem (whatever it is) to share the yaml dumped facts.

Choose your poison :-)
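
As an illustration of the client IP hash option in the list above, a rough Python sketch follows. The function name, master names and addresses are invented, and plain modulo hashing is just one simple scheme; real load balancers may use something else.

```python
import hashlib


def pick_master(client_ip, masters):
    """Map a client IP to one master with a stable hash (modulo pool size)."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    return masters[int.from_bytes(digest[:8], "big") % len(masters)]


masters = ["master-a", "master-b", "master-c"]
clients = [f"10.0.0.{i}" for i in range(1, 21)]

# Same pool, same answer: the facts upload and the catalog request from
# one client land on the same master.
before = {ip: pick_master(ip, masters) for ip in clients}
again = {ip: pick_master(ip, masters) for ip in clients}

# Drop one master and see which clients get remapped.
survivors = [m for m in masters if m != "master-b"]
after = {ip: pick_master(ip, survivors) for ip in clients}
moved = [ip for ip in clients if before[ip] != after[ip]]

print(f"{len(moved)} of {len(clients)} clients changed master")
```

The mapping is only stable while the pool is stable. When a master leaves, every client it served is remapped, and with modulo hashing clients of the surviving masters may move as well.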
--
Brice Figureau
Follow the latest Puppet Community evolutions on www.planetpuppet.org!

Thomas Bellman

Nov 26, 2009, 5:05:42 AM
to puppe...@googlegroups.com
Luke Kanies wrote:

> The problem with the facts, and I believe this is
> unique to the fact/catalog coupling, is that the compiling server
> needs access to the client's facts, and we'd prefer to do that over
> two queries rather than one.

Is two queries really better than one? Remembering the facts for
a client for an indeterminate time sounds to me like it would make
puppetmasterd more complicated.

From what I can remember and understand from the earlier discussion,
the only reason for having two queries would be to work around a
limitation when using GET with some webservers, and the reason we
use GET is because the REST paradigm tells us to do that. But does
that rule really give us some advantage, or is it just an ivory
tower proclamation that you are unclean if you don't?


/Bellman

Nigel Kersten

Nov 26, 2009, 10:49:49 AM
to puppe...@googlegroups.com
On Thu, Nov 26, 2009 at 12:47 AM, Brice Figureau
<brice-...@daysofwonder.com> wrote:

> * we ask users to use a LB with an client ip hash load balancing scheme
> (ie client are sent always to the same master).

As far as I can see it, this would preclude providing failover in the
middle of a puppet run.

Markus Roberts

Nov 26, 2009, 3:23:07 PM
to puppet-dev
>> I'm thinking that it might be a better idea to solve this problem than
>> to hack around it.  The main solution I'm thinking of is essentially
>> requiring some kind of shared back-end or requiring a shared cache
>> such as memcached.
>
> My concern here is that we don't want to cache the facts, we want to
> have them at disposal to several masters. The issue with caching is that
> you're never sure the content will be there (that's even the whole point
> of cache).
> So what happens if memcached decides to purge the facts for a given host
> and said host asks for a catalog?

Technically, yes, but Luke's broader point stands; there are a number
of solutions (e.g. memcachedb and tokyo tyrant) that use the memcached
protocol but are persistent.

> What we need is a (more) persistent shared storage for this. And the
> only one we have at the moment is storeconfigs/thin_storeconfigs.

As Luke noted, memcached (and other systems that use the protocol)
should be quite doable; there are also a slew of other options (such
as Maglev and the other nosql systems). This would be a prime
candidate for plugins.

> Granted those are performance suckers (less of course for
> thin_storeconfigs), so that might not be useful for large sites (which
> of course need several masters).

I suspect that the performance issues are resolvable.

>> A shared cache with memcached should be pretty close to trivial - just
>> another terminus type.  This obviously adds another dependency, but
>> only in those cases where you 1) have multiple masters, 2) don't have
>> client binding to an individual master, and 3) aren't using some
>> common back-end (one of which will be available from us with this
>> information by the next major release).

Having an additional dependency for an optional feature seems quite reasonable.

> * we bend the REST model to POST the facts and get the catalog as a
> result (ie one transaction like now, but posted)

Sure. I mean, if you're willing to contort HTTP and pretend it's an
RPC system (which is what REST is), a little extra bending to make it
actually work shouldn't be that objectionable. Are there any "REST
purists" on this bus, and if so have you thought about how paradoxical
that is? If we're all pragmatists here, this may be the simplest/most
reliable solution.

> * we HTTP pipeline the facts posting and the catalog get in the same
> stream (not sure LB wouldn't split the request in piece and direct those
> to different upstream masters).

I have low confidence in this working, if only based on the number of
ways I can imagine it going wrong.

> * we ask users to use a LB with an client ip hash load balancing scheme
> (ie client are sent always to the same master).

That could work, though it makes rollover more granular and may
require a more sophisticated LB setup. If puppetmasters can come on
and off line without requiring a system-wide hiatus, the LB is going
to have to be pretty savvy.

> * we implement a master to master protocol (or a ring ala spread or
> using a message queue/topic). If one client asks for a catalog to master
> A, A contacts the other masters for the one having the freshest facts,
> compiles and sends back the catalog.

Message bus style, perhaps (they all listen and cache everything), but
any time you tell your systems to form a committee performance
plummets--if we assume that each puppetmaster can serve up to k
clients we've taken it from something that scales O(n) or O(n log n)
to something that's O(n^2) or worse. How do you deal with
slow-responding peers? Does A keep a copy? How do you deal with
server failure (e.g. the machine with the most current copy goes
down)?

> * we don't care and ask users wanting to have multiple masters to use a
> shared filesystem (whatever it is) to share the yaml dumped facts.

It could work. It could also fail due to various race conditions.

> Choose your poison :-)

*smile* I did. That's how I wound up here.

-- Markus

Frank Sweetser

Nov 26, 2009, 8:02:17 PM
to puppe...@googlegroups.com, Markus Roberts
On 11/26/2009 3:23 PM, Markus Roberts wrote:

>> * we ask users to use a LB with an client ip hash load balancing scheme
>> (ie client are sent always to the same master).
>
> That could work, though it makes rollover more granular and may
> require a more sophisticated LB setup. If puppetmasters can come on
> and off line without requiring a system-wide hiatus, the LB is going
> to have to be pretty savvy.

On the couple of commercial grade load balancers I've used (mostly F5) this
kind of sophistication is standard fare. Typically you define a functional
health check (open a TCP connection, do an HTTP request, that sort of thing)
that is used to determine if a given server is functional. If it fails the
health check, the server is pulled out of the pool until it starts passing
again, and all of its clients get redirected to new backend servers.

--
Frank Sweetser fs at wpi.edu | For every problem, there is a solution that
WPI Senior Network Engineer | is simple, elegant, and wrong. - HL Mencken
GPG fingerprint = 6174 1257 129E 0D21 D8D4 E8A3 8E39 29E3 E2E8 8CEC

Markus

Nov 26, 2009, 8:36:41 PM
to puppe...@googlegroups.com

> >> * we ask users to use a LB with an client ip hash load balancing scheme
> >> (ie client are sent always to the same master).
> >
> > That could work, though it makes rollover more granular and may
> > require a more sophisticated LB setup. If puppetmasters can come on
> > and off line without requiring a system-wide hiatus, the LB is going
> > to have to be pretty savvy.
>
> On the couple of commercial grade load balancers I've used (mostly F5) this
> kind of sophistication is standard fare. Typically you define a functional
> health check (open a TCP connection, do an HTTP request, that sort of thing)
> that is used to determine if a given server is functional. If it fails the
> health check, the server is pulled out of the pool until it starts passing
> again, and all of its clients get redirected to new backend servers.

The problem is we're talking about using it to maintain a stateful
association between the client and the puppetmaster. If a client sends
its facts to a server before it goes down, the backup server isn't going
to have access to those facts; likewise, if a client sends its facts to
a backup server right before the primary comes back on line. And so on.

The problem isn't determining if a server is functional (that part, as
you note, is standard) it's dealing with the statefulness of client/
server bindings and doing the right thing when those change.
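
The failure mode can be shown with a toy simulation in Python. The Master class and route function are hypothetical stand-ins, assuming each master keeps uploaded facts only in its own memory and the balancer simply picks the first healthy backend.

```python
class Master:
    """Toy puppetmaster that keeps uploaded facts only in local memory."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.facts = {}

    def put_facts(self, host, facts):
        self.facts[host] = facts

    def get_catalog(self, host):
        if host not in self.facts:
            return None  # cannot compile without this host's facts
        return f"catalog for {host} compiled on {self.name}"


def route(pool):
    """Health-checked balancer: the first master that is up."""
    return next(m for m in pool if m.up)


primary, backup = Master("primary"), Master("backup")
pool = [primary, backup]

route(pool).put_facts("web1", {"os": "linux"})  # lands on primary
primary.up = False                              # primary fails its health check
result = route(pool).get_catalog("web1")        # backup answers, has no facts

print(result)
```

The health check does its job and the request still fails, because the state the request depends on stayed behind on the failed master.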

-- Markus


Frank Sweetser

Nov 26, 2009, 8:59:39 PM
to puppe...@googlegroups.com, Markus
Ah, I understand what you mean now. At least in the case of F5 load
balancers, I'm not sure if it's the default for source IP session persistence,
but it's quite possible to set them up so that the source IP is used as a key
into a persistent state table that maintains the client to backend server
mappings. HTTP session persistence is a pretty common problem in the load
balancing world, so as long as puppet clearly spells out its requirements, it
should be possible to set things up to meet them.

Alternatively, I haven't looked deeply enough, but if puppet uses a single TCP
socket for the entire lifetime of a run, then you could set up the load
balancer to operate on the TCP layer instead of digging into the HTTP(S)
layer. That way, since the load balancer sees everything as a single atomic
TCP session, there's no possibility for it to send different requests to
different backend servers within a single run.

Nigel Kersten

Nov 28, 2009, 11:30:04 AM
to puppe...@googlegroups.com
On Wed, Nov 25, 2009 at 11:50 AM, Luke Kanies <lu...@reductivelabs.com> wrote:
> Looking at the problems resulting from us putting facts in the get
> request, e.g. #2855, makes me again think this is the wrong approach.

I don't understand why that bug is such an issue to be honest. 

Why not just make zlib a requirement for Puppet?

Luke Kanies

Nov 28, 2009, 1:18:01 PM
to puppe...@googlegroups.com
In our case it's not so much that we have an ivory tower but that we
have a system implemented around GET, with no real provision for ever
using POST. Not that it's impossible, but it'd be a one-off for both
client and server, or it would drastically complicate the model we use
for passing information around the network.

Hmm, well, maybe not drastically; I suppose we could have an argument
that causes the equivalent of a 'get' to be returned as the result of
the original call. That's still a significant change -- would we have
to change our 'put' to a 'post'? -- but not untenable, I think.

--
The great aim of education is not knowledge but action.
-- Herbert Spencer

Luke Kanies

Nov 28, 2009, 1:21:27 PM
to puppe...@googlegroups.com
Well, we've traditionally been pretty strict on our dependencies and
have done a good job of running in a degraded mode if necessary rather
than failing without dependencies.

To have a hackish fix to a relatively poor design decision be the
reason we add a new dependency is a bit galling.

--
The most dangerous strategy is to jump a chasm in two leaps.
-- Benjamin Disraeli

Luke Kanies

Nov 28, 2009, 1:30:28 PM
to puppe...@googlegroups.com
On Nov 26, 2009, at 12:23 PM, Markus Roberts wrote:

>>> I'm thinking that it might be a better idea to solve this problem
>>> than
>>> to hack around it. The main solution I'm thinking of is essentially
>>> requiring some kind of shared back-end or requiring a shared cache
>>> such as memcached.
>>
>> My concern here is that we don't want to cache the facts, we want to
>> have them at disposal to several masters. The issue with caching is
>> that
>> you're never sure the content will be there (that's even the whole
>> point
>> of cache).
>> So what happens if memcached decides to purge the facts for a given
>> host
>> and said host asks for a catalog?
>
> Technically, yes, but Luke's broader point stands; there are a number
> of solutions (e.g. memcachedb and tokyo tyrant) that use the memcached
> protocol but are persistent.

Yep. Additionally, though, there really is a good bit of caching
going on on the server now, and doing so with memcached makes a lot
more sense in many cases.

The complication is that most of that caching is a traditional cache
-- we can get new data if it goes stale -- but the fact information
isn't really a cache in that sense.

>> What we need is a (more) persistent shared storage for this. And the
>> only one we have at the moment is storeconfigs/thin_storeconfigs.
>
> As Luke noted, memcached (and other system that use the protocol)
> should be quite doable; there are also a slew of other options (such
> as Maglev and the other nosql systems). This would be a prime
> candidate for plugins.
>
>> Granted those are performance suckers (less of course for
>> thin_storeconfigs), so that might not be useful for large sites
>> (which
>> of course need several masters).
>
> I suspect that the performance issues are resolvable.

Probably, although I'm not convinced it's possible to do so without
changing technologies (Brice, how's your research into TokyoCabinet et
al going?).

Really, though, to store the Fact information, the performance won't
be nearly as big a problem.

>>> A shared cache with memcached should be pretty close to trivial -
>>> just
>>> another terminus type. This obviously adds another dependency, but
>>> only in those cases where you 1) have multiple masters, 2) don't
>>> have
>>> client binding to an individual master, and 3) aren't using some
>>> common back-end (one of which will be available from us with this
>>> information by the next major release).
>
> Having an additional dependency for an optional feature seems quite
> reasonable.
>
>> * we bend the REST model to POST the facts and get the catalog as a
>> result (ie one transaction like now, but posted)
>
> Sure. I mean, if you're willing to contort HTTP and pretend it's a
> RPC system (which is what REST is), a little extra bending to make it
> actually work shouldn't be that objectionable. Are there any "REST
> purest" on this bus, and if so have you thought about how paradoxical
> that is? If we're all pragmatist here, this may be the simplest/most
> reliable solution.

AFAIK there haven't been any purist arguments in the group, and as you
say, that would be pretty silly.

My only concern is how much it requires a change to the existing
architecture or a one-off solution that's painful to maintain over time.

>> * we HTTP pipeline the facts posting and the catalog get in the same
>> stream (not sure LB wouldn't split the request in piece and direct
>> those
>> to different upstream masters).
>
> I have low confidence in this working, if only based on the number of
> ways I can imagine it going wrong.

I concur.

>> * we ask users to use a LB with an client ip hash load balancing
>> scheme
>> (ie client are sent always to the same master).
>
> That could work, though it makes rollover more granular and may
> require a more sophisticated LB setup. If puppetmasters can come on
> and off line without requiring a system-wide hiatus, the LB is going
> to have to be pretty savvy.
>
>> * we implement a master to master protocol (or a ring ala spread or
>> using a message queue/topic). If one client asks for a catalog to
>> master
>> A, A contacts the other masters for the one having the freshest
>> facts,
>> compiles and sends back the catalog.
>
> Message bus style, perhaps (they all listen and cache everything), but
> any time you tell your systems to form a committee performance
> plummets--if we assume that each puppetmaster can serve up to k
> clients we've taken it from something that scales O(n) or O(n log n)
> to something that's O(n^2) or worse. How do you deal with
> slow-responding peers? Does A keep a copy? How do you deal with
> server failure (e.g. the machine with the most current copy goes
> down)?

I think this is a complicated solution to what should not be a
complicated problem.

>> * we don't care and ask users wanting to have multiple masters to
>> use a
>> shared filesystem (whatever it is) to share the yaml dumped facts.
>
> It could work. It could also fail due to various race conditions.


This basically says that multiple masters is really complicated and
you shouldn't do it, which is not where we want to end up.

IMO, the right approach is to have a node manager capable of
functioning as an inventory server (holding all fact/node data), and
then have the servers query that (with the same kind of caching
they're doing now).

This gets you essentially everything you need, and all it says is: If
you want multimaster, you have to have an inventorying node manager.

Which, conveniently, we're about to release a 0.1 of this coming
week. Well, technically it's just the node manager part, but we'll be
quickly adding the inventory bits.

--
Education is when you read the fine print. Experience is what you get
if you don't. -- Pete Seeger

Markus

Nov 28, 2009, 2:54:39 PM11/28/09
to puppe...@googlegroups.com

> In our case it's not so much that we have an ivory tower but that we
> have a system implemented around GET, with no real provision for ever
> using POST. Not that it's impossible, but it'd be a one-off for both
> client and server, or it would drastically complicate the model we use
> for passing information around the network.
>
> Hmm, well, maybe not drastically; I suppose we could have an argument
> that causes the equivalent of a 'get' to be returned as the result of
> the original call. That's still a significant change -- would we have
> to change our 'put' to a 'post'? -- but not untenable, I think.

It could be even simpler, I think.

* Switch to issuing a POST everywhere that we presently issue a GET,
with a fallback to GET if the POST is rejected (405) or even better
based on the api version for mixed system backwards compatibility.

* Switch to accepting both POST and GET where we now accept only GET.
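
A minimal sketch of the client-side fallback in the first bullet, assuming a pluggable transport so it can run without a network. The `Response` struct, the `request_with_fallback` helper, and the example path are all hypothetical names for illustration, not existing Puppet code:

```ruby
# Hypothetical response object; a real client would use Net::HTTP responses.
Response = Struct.new(:code, :body)

# Try POST first; if the server answers 405 (Method Not Allowed),
# assume it is an older master and repeat the request as a GET.
# `transport` is any callable taking (verb, path, payload).
def request_with_fallback(transport, path, payload)
  response = transport.call(:post, path, payload)
  return response unless response.code == 405

  transport.call(:get, path, payload)
end

# Stand-in for an old master that only accepts GET.
old_master = lambda do |verb, path, _payload|
  verb == :get ? Response.new(200, "catalog for #{path}") : Response.new(405, "")
end

result = request_with_fallback(old_master, "/production/catalog/web01", "facts")
puts result.code
```

Keying the choice on an advertised API version instead of a 405 would save the extra round trip against old masters, at the cost of needing that version to be known up front.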

-- Markus


Brice Figureau

Nov 28, 2009, 4:57:43 PM11/28/09
to puppe...@googlegroups.com
That was one of my concerns, too, and something that I always found
strange (i.e. storeconfigs being used in the cache part of the indirector).

>>> What we need is a (more) persistent shared storage for this. And the
>>> only one we have at the moment is storeconfigs/thin_storeconfigs.
>>
>> As Luke noted, memcached (and other system that use the protocol)
>> should be quite doable; there are also a slew of other options (such
>> as Maglev and the other nosql systems). This would be a prime
>> candidate for plugins.
>>
>>> Granted those are performance suckers (less of course for
>>> thin_storeconfigs), so that might not be usefull for large sites
>>> (which
>>> of course needs several masters).
>>
>> I suspect that the performance issues are resolvable.
>
> Probably, although I'm not convinced it's possible to do so without
> changing technologies (Brice, how's your research into TokyoCabinet et
> al going?).

I couldn't make any progress so far, my puppet work has stalled recently
because of an activity surge at the office.

> Really, though, to store the Fact information, the performance won't
> be nearly as big a problem.

Sure. We can even use thin_storeconfigs for that.

>>>> A shared cache with memcached should be pretty close to trivial -
>>>> just
>>>> another terminus type. This obviously adds another dependency, but
>>>> only in those cases where you 1) have multiple masters, 2) don't
>>>> have
>>>> client binding to an individual master, and 3) aren't using some
>>>> common back-end (one of which will be available from us with this
>>>> information by the next major release).
>>
>> Having an additional dependency for an optional feature seems quite
>> reasonable.
>>
>>> * we bend the REST model to POST the facts and get the catalog as a
>>> result (ie one transaction like now, but posted)
>>
>> Sure. I mean, if you're willing to contort HTTP and pretend it's a
>> RPC system (which is what REST is), a little extra bending to make it
>> actually work shouldn't be that objectionable. Are there any "REST
>> purest" on this bus, and if so have you thought about how paradoxical
>> that is? If we're all pragmatist here, this may be the simplest/most
>> reliable solution.
>
> AFAIK there haven't been any purist arguments in the group, and as you
> say, that would be pretty silly.
>
> My only concern is how much it requires a change to the existing
> architecture or a one-off solution that's painful to maintain over time.

I didn't really check, but I didn't think it was complex...

[snipped]
>>> * we don't care and ask users wanting to have multiple masters to
>>> use a
>>> shared filesystem (whatever it is) to share the yaml dumped facts.
>>
>> It could work. It could also fail due to various race conditions.
>
>
> This basically says that multiple masters is really complicated and
> you shouldn't do it, which is not where we want to end up.
>
> IMO, the right approach is to have a node manager capable of
> functioning as an inventory server (holding all fact/node data), and
> then have the servers query that (with the same kind of caching
> they're doing now).
>
> This gets you essentially everything you need, and all it says is: If
> you want multimaster, you have to have an inventorying node manager.

But people running multi master mainly do this for failure resistance,
and we're just adding a single point of failure... Don't you think this
is a problem?
--
Brice Figureau
My Blog: http://www.masterzen.fr/

Markus

Nov 28, 2009, 5:20:43 PM11/28/09
to puppe...@googlegroups.com
> > This gets you essentially everything you need, and all it says is: If
> > you want multimaster, you have to have an inventorying node manager.
>
> But people running multi master mainly do this for failure resistance,
> and we're just adding a single point of failure... Don't you think this
> is a problem?

Yeah, that was my thought too. There are times when this is still
reasonable (if you replace a complex, error prone single point of
failure with one that is simpler and more reliable you're still ahead)
but in general it's something you'd try to avoid.

On the other other hand, it's quite possible that the node manager will
be directly scalable (using, say, normal db replication) so this concern
would go away.

-- Markus





Christian Hofstaedtler

Nov 28, 2009, 5:31:23 PM11/28/09
to puppe...@googlegroups.com
* Brice Figureau <brice-...@daysofwonder.com> [091128 22:58]:
[snipped a lot]
> [snipped]
> >>> * we don't care and ask users wanting to have multiple masters to
> >>> use a
> >>> shared filesystem (whatever it is) to share the yaml dumped facts.
> >>
> >> It could work. It could also fail due to various race conditions.
> >
> >
> > This basically says that multiple masters is really complicated and
> > you shouldn't do it, which is not where we want to end up.
> >
> > IMO, the right approach is to have a node manager capable of
> > functioning as an inventory server (holding all fact/node data), and
> > then have the servers query that (with the same kind of caching
> > they're doing now).
> >
> > This gets you essentially everything you need, and all it says is: If
> > you want multimaster, you have to have an inventorying node manager.
>
> But people running multi master mainly do this for failure resistance,
> and we're just adding a single point of failure... Don't you think this
> is a problem?


I've not completely read this thread, but it seems like there is
confusion about what people really want. My observation is, that
people mostly want:

load sharing
and/or
fault tolerance

And both of these things are currently very complicated to do as
long as the client has only one hostname to talk to.

Why not change this?

For fault tolerance and small load sharing setups, configure a list
of servers to talk to on the client. At run-time, the client shall pick
one server at random and stick to it for the rest of the run. If the
server doesn't respond, start over with the selection mechanism.

For setups with more servers/masters (I expect that people probably
don't want to configure >2 masters in their client configs), the
initially contacted master may hand out a name of a master to talk
to. This is basically the same as suggested already very early in
this thread, but it may get combined with a node manager (whatever
that one will do).

Doing both things you can get load sharing and fault tolerance (for
single masters and the node manager).
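
A rough sketch of the client-side selection described above, assuming a configured server list and a reachability probe. `select_server` and `reachable` are made-up names, and the hostnames are placeholders:

```ruby
# Pick one server at random from the configured list and keep it for the
# whole run. If it does not respond, drop it and choose again.
# `reachable` is a hypothetical probe (any callable returning true/false).
def select_server(servers, reachable, rng: Random.new)
  candidates = servers.dup
  until candidates.empty?
    choice = candidates.sample(random: rng)
    return choice if reachable.call(choice)

    candidates.delete(choice)
  end
  raise "no puppetmaster reachable"
end

servers = ["master1.example.com", "master2.example.com"]
up = ->(host) { host == "master2.example.com" }
puts select_server(servers, up)
```

The run would call this once at the start and reuse the returned name for every later request.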


Christian

--
christian hofstaedtler

Ohad Levy

Nov 29, 2009, 9:43:08 AM11/29/09
to puppe...@googlegroups.com
On Sun, Nov 29, 2009 at 6:31 AM, Christian Hofstaedtler <c...@zeha.at> wrote:
> load sharing
> and/or
> fault tolerance
>
> And both of these things are currently very complicated to do as
> long as the client has only one hostname to talk to.
>
> Why not change this?

+1 - I think it's acceptable for each client to connect to one server and keep on using that server for its whole "puppet run".

A simple solution might be to implement a DNS SRV record (e.g. like LDAP) that lets the client decide which puppetmaster to connect to.
In time this could be enhanced to take server load into account (so the client could try another server or wait for a while).
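
To illustrate the SRV idea, here is a small sketch that orders candidate masters from SRV-style records. The `SrvRecord` struct, the `order_masters` helper and the hostnames are hypothetical; a real client would fetch the records from DNS rather than hard-code them, and 8140 is simply the usual puppetmaster port:

```ruby
# Mirrors the fields of a DNS SRV record (RFC 2782).
SrvRecord = Struct.new(:priority, :weight, :port, :target)

# Order candidates: lowest priority value first, and within one priority
# prefer the higher weight. (RFC 2782 actually calls for a weighted random
# choice within a priority; this deterministic ordering is a simplification.)
def order_masters(records)
  records.sort_by { |r| [r.priority, -r.weight] }
end

records = [
  SrvRecord.new(20, 0, 8140, "backup.example.com"),
  SrvRecord.new(10, 40, 8140, "master2.example.com"),
  SrvRecord.new(10, 60, 8140, "master1.example.com"),
]
order_masters(records).each { |r| puts "#{r.target}:#{r.port}" }
```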
 
I would be happy not to add any additional dependencies (even though memcached is acceptable), especially a database; e.g. if I have 5 locations where I need HA + load sharing, I don't want to end up maintaining 5 sets of clusters.

my 2 cents,
Ohad

Nigel Kersten

Nov 29, 2009, 1:45:09 PM11/29/09
to puppe...@googlegroups.com
On Sun, Nov 29, 2009 at 6:43 AM, Ohad Levy <ohad...@gmail.com> wrote:
> On Sun, Nov 29, 2009 at 6:31 AM, Christian Hofstaedtler <c...@zeha.at> wrote:
>> load sharing
>> and/or
>> fault tolerance
>>
>> And both of these things are currently very complicated to do as
>> long as the client has only one hostname to talk to.
>>
>> Why not change this?
>
> +1 - I think it's acceptable for each client to connect to one server and keep on using that server for its whole "puppet run".

I think it's perfectly reasonable, and would make failover a lot simpler.

Do we have all the information internally to tell when an error indicates that a server is unavailable?

Would we give up and consider it a failure every time we can't find a puppet:/// file resource? So we'd be changing behavior when someone typos a puppet URI? Should the behavior be different if we time out retrieving rather than not being able to find it at all?

However, does this really help load balancing?

Say one server in a pair is overloaded and timing out on file resources... do clients simply start their run all over again with the other server? That seems kind of inefficient, given that they may have all progressed quite far into their run.

I'd really like to be able to combine both. Shared state for load balanced pairs, multiple servers in the client config for failover and restarting the current run.

> A simple solution might be to implement a DNS SRV record (e.g. like LDAP) that lets the client decide which puppetmaster to connect to.
> In time this could be enhanced to take server load into account (so the client could try another server or wait for a while).

This is essentially what we're doing now. We have simple monitoring in place so all our clients can check the load of the puppet server their DNS view points to, and fall back to an alternate server if the load is too high.

> I would be happy not to add any additional dependencies (even though memcached is acceptable), especially a database; e.g. if I have 5 locations where I need HA + load sharing, I don't want to end up maintaining 5 sets of clusters.

++

I can't see a way around shared state for efficient load balancing, but think that being able to provide a list of puppet servers to the clients would greatly help with failover.

> my 2 cents,
> Ohad

Luke Kanies

Nov 30, 2009, 2:00:16 AM11/30/09
to puppe...@googlegroups.com
On Nov 28, 2009, at 1:57 PM, Brice Figureau wrote:

>> This basically says that multiple masters is really complicated and
>> you shouldn't do it, which is not where we want to end up.
>>
>> IMO, the right approach is to have a node manager capable of
>> functioning as an inventory server (holding all fact/node data), and
>> then have the servers query that (with the same kind of caching
>> they're doing now).
>>
>> This gets you essentially everything you need, and all it says is:
>> If
>> you want multimaster, you have to have an inventorying node manager.
>
> But people running multi master mainly do this for failure resistance,
> and we're just adding a single point of failure... Don't you think
> this
> is a problem?


Yeah, it is, I guess. My plan was always to punt with "scale it like
you would any rails app", but I know that's a cop-out.

You're right that, in the end, the structure of the interactions
shouldn't require that separate system.

--
There are three kinds of death in this world. There's heart death,
there's brain death, and there's being off the network. -- Guy Almes

Luke Kanies

Nov 30, 2009, 2:06:53 AM11/30/09
to puppe...@googlegroups.com
On Nov 29, 2009, at 6:43 AM, Ohad Levy wrote:

>
>
> On Sun, Nov 29, 2009 at 6:31 AM, Christian Hofstaedtler <c...@zeha.at>
> wrote:
> load sharing
> and/or
> fault tolerance
>
> And both of these things are currently very complicated to do as
> long as the client has only one hostname to talk to.
>
> Why not change this?
>
> +1 - I think its acceptable for each client to connect to one server
> and keep on using that server for its whole "puppet run".

I agree, although I'd simplify it and say, any connection mechanism
should be connected to the server selection mechanism such that
certain failures result in a reselection.

> a simple solution might be to implement a DNS SRV record (e.g. like
> LDAP) which allows the client to decide to which puppetmaster he
> would like to connect to.
> this in time could be enhanced to get the server load etc (so it
> could try to use another server or to wait for a while).

I agree, although this involves building, essentially, a server
selection subsystem, which anyone who uses the server name would use
to pick the server. We can expand it over time, but I'd certainly
like to start simply - support a list, and pick a new one if the
current one starts throwing errors.

> I would be happy not to add any additional depedencies (even though
> memcache is acceptable) - a specially a database, e.g. if I have 5
> locations where i need HA + load sharing, i don't want to end up
> maintaining 5 set of clusters.


I agree, and that was certainly one of my concerns.

--
Do you realize if it weren't for Edison we'd be watching TV by
candlelight? -- Al Boliska

Luke Kanies

Nov 30, 2009, 2:09:30 AM11/30/09
to puppe...@googlegroups.com
On Nov 29, 2009, at 10:45 AM, Nigel Kersten wrote:

>
>
> On Sun, Nov 29, 2009 at 6:43 AM, Ohad Levy <ohad...@gmail.com> wrote:
>
>
> On Sun, Nov 29, 2009 at 6:31 AM, Christian Hofstaedtler <c...@zeha.at>
> wrote:
> load sharing
> and/or
> fault tolerance
>
> And both of these things are currently very complicated to do as
> long as the client has only one hostname to talk to.
>
> Why not change this?
>
> +1 - I think its acceptable for each client to connect to one server
> and keep on using that server for its whole "puppet run".
>
> I think it's perfectly reasonable, and would make failover a lot
> simpler.
>
> Do we have all the information internally to tell when an error
> indicates that a server is unavailable?

Hah, I doubt it.

> Would we give up and consider it a failure every time we can't find
> a puppet:/// file resource? So we'd be changing behavior when
> someone typos a puppet URI ? Should the behavior be different if we
> time out retrieving rather than not being able to find it at all?

Urgh, no way. The connection itself needs to fail, not just have some
random exception - probably, anything other than a timeout wouldn't
constitute a failure.

> However, does this really help load balancing?
>
> Say one server in a pair is overloaded and timing out on file
> resources... do clients simply start their run all over again with
> the other server? That seems kind of inefficient.... given that they
> may have all progressed quite far into their run.

With the 'retry' functionality in ruby, the caller never knows of a
problem unless none of the servers work. You pick a new server,
reconnect, and keep going. I can't think of anything that would
reasonably restart the whole run itself.
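
Roughly what I mean, as a sketch rather than real Puppet code. `ConnectionTimeout` and `with_failover` are made-up names, and only a timeout triggers reselection, per the above:

```ruby
# Raised when a connection attempt times out; stands in for the real
# timeout error a network library would raise.
class ConnectionTimeout < StandardError; end

# Run the block against the current server. On a timeout, switch to the
# next server and retry; the caller only sees an error once every server
# has been tried.
def with_failover(servers)
  remaining = servers.dup
  server = remaining.shift
  begin
    yield server
  rescue ConnectionTimeout
    raise if remaining.empty?

    server = remaining.shift
    retry
  end
end

attempts = []
result = with_failover(["master1", "master2"]) do |server|
  attempts << server
  raise ConnectionTimeout if server == "master1"

  "file from #{server}"
end
puts result
```

Any other exception (a missing file, a typo in a puppet URI) passes straight through without touching the server list.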

> I'd really like to be able to combine both. Shared state for load
> balanced pairs, multiple servers in the client config for failover
> and restarting the current run.
>
>
> a simple solution might be to implement a DNS SRV record (e.g. like
> LDAP) which allows the client to decide to which puppetmaster he
> would like to connect to.
> this in time could be enhanced to get the server load etc (so it
> could try to use another server or to wait for a while).
>
> This is essentially what we're doing now. We have simple monitoring
> in place so all our clients can check the load of the puppet server
> their DNS view points to, and fall back to an alternate server if
> the load is too high.
>
>
> I would be happy not to add any additional depedencies (even though
> memcache is acceptable) - a specially a database, e.g. if I have 5
> locations where i need HA + load sharing, i don't want to end up
> maintaining 5 set of clusters.
>
> ++
>
> I can't see a way around shared state for efficient load balancing,
> but think that being able to provide a list of puppet servers to the
> clients would greatly help with failover.


I agree. Any volunteers? :)

Especially since rowlf was supposed to be out this quarter but it
looks like we'll be releasing 0.25.2 on the same timeframe instead.

--
The great tragedy of Science - the slaying of a beautiful hypothesis by
an ugly fact. --Thomas H. Huxley

Luke Kanies

Nov 30, 2009, 2:11:33 AM11/30/09
to puppe...@googlegroups.com
I'm comfortable with this, and I like the idea of the server being
agnostic about post/get.

It only gets us halfway, though - we need a POST to return the results
of another query entirely (i.e., the POST of the Facts should return
the results of a GET of a Catalog). How would we do that?

--
Tradition is what you resort to when you don't have the time or the
money to do it right. -- Kurt Herbert Alder

Paul Nasrat

Nov 30, 2009, 3:41:05 AM11/30/09
to puppe...@googlegroups.com
2009/11/30 Luke Kanies <lu...@madstop.com>:
> On Nov 28, 2009, at 11:54 AM, Markus wrote:
>
>>
>>> In our case it's not so much that we have an ivory tower but that we
>>> have a system implemented around GET, with no real provision for ever
>>> using POST.  Not that it's impossible, but it'd be a one-off for both
>>> client and server, or it would drastically complicate the model we
>>> use
>>> for passing information around the network.
>>>
>>> Hmm, well, maybe not drastically; I suppose we could have an argument
>>> that causes the equivalent of a 'get' to be returned as the result of
>>> the original call.  That's still a significant change -- would we
>>> have
>>> to change our 'put' to a 'post'? -- but not untenable, I think.
>>
>> It could be even simpler, I think.
>>
>> * Switch to issuing a POST everywhere that we presently issue a GET,
>> with a fallback to GET if the POST is rejected (405) or even better
>> based on the api version for mixed system backwards compatibility.
>>
>> * Switch to accepting both POST and get where we now accept only GET.
>
>
> I'm comfortable with this, and I like the idea of the server being
> agnostic about post/get.
>
> It only gets us halfway, though - we need a POST to return the results
> of another query entirely (i.e., the POST of the Facts should return
> the results of a GET of a Catalog).  How would we do that?

It's worth thinking about what HTTP gives us here to do things like
this. One option would be to respond to the post with a temporary
redirect to the catalog that the client then GETs.

It's probably worth thinking about how the REST API works at the HTTP
level as well if you want to do the "scale it like any rails app".
There are a lot of things we can do (If-Modified-Since, Etag, Expires)
to handle both metadata and caching rules.
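
As an illustration of the caching side, a minimal sketch of ETag validation on a catalog fetch. `etag_for` and `serve_catalog` are hypothetical helpers, and hashing the serialized catalog is just one possible way to derive the tag:

```ruby
require "digest"

# Derive an entity tag from the serialized catalog.
def etag_for(catalog_body)
  "\"#{Digest::SHA256.hexdigest(catalog_body)}\""
end

# Return [status, headers, body]. If the client's If-None-Match value
# matches the current tag, answer 304 with no body.
def serve_catalog(catalog_body, if_none_match = nil)
  tag = etag_for(catalog_body)
  return [304, { "ETag" => tag }, ""] if if_none_match == tag

  [200, { "ETag" => tag }, catalog_body]
end

first = serve_catalog("catalog-v1")
second = serve_catalog("catalog-v1", first[1]["ETag"])
puts first[0], second[0]
```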

Paul

Paul Nasrat

Nov 30, 2009, 3:53:11 AM11/30/09
to puppe...@googlegroups.com
2009/11/30 Paul Nasrat <pna...@googlemail.com>:
I'm just reading Jim Webber's REST tutorial

http://jim.webber.name/2009/11/20/8eae595a-d1d2-4f4f-87f6-f67280013176.aspx

A more condensed article is here:

http://www.infoq.com/articles/webber-rest-workflow

I'm not through it yet as it's a large deck but looking at it

POST /url/to/post/facts

Could return a HTTP 201 with a Location:

I'd really like us to write up our REST API interactions for puppet so
the flow is documented.

Paul

Thomas Bellman

Nov 30, 2009, 5:17:53 AM11/30/09
to puppe...@googlegroups.com
Paul Nasrat wrote:

> 2009/11/30 Paul Nasrat <pna...@googlemail.com>:

>> It's worth thinking about what HTTP gives us here to do things like
>> this. One option would be to respond to the post with a temporary
>> redirect to the catalog that the client then GET's.

> I'm just reading Jim Webber's REST tutorial
>
> http://jim.webber.name/2009/11/20/8eae595a-d1d2-4f4f-87f6-f67280013176.aspx
>
> A more condensed article is here:
>
> http://www.infoq.com/articles/webber-rest-workflow
>
> I'm not through it yet as it's a large deck but looking at it
>
> POST /url/to/post/facts
>
> Could return a HTTP 201 with a Location:

But then you are back to two requests! What would be the advantage
of that?

It's perfectly allowable to return content on a POST. There's no
need for Location:.


/Bellman

Paul Nasrat

Nov 30, 2009, 5:25:58 AM11/30/09
to puppe...@googlegroups.com
2009/11/30 Thomas Bellman <bel...@nsc.liu.se>:
I guess it depends on whether you want a client to be able to refer to a
resource that represents the catalog at a point in time. This would
potentially allow us to express that the catalog hasn't
changed, etc. It might be a personal preference but it feels to me to
be a richer representation of an API.

Paul

Ramon van Alteren

Nov 30, 2009, 5:31:32 AM11/30/09
to puppe...@googlegroups.com
On Wed, Nov 25, 2009 at 03:39:36PM -0800, David Lutterkort wrote:
> On Wed, 2009-11-25 at 13:16 -0800, Luke Kanies wrote:
> > On Nov 25, 2009, at 12:38 PM, Markus Roberts wrote:
> >
> > > One possibility we're overlooking here (I'm not making any claims
> > > apart from the fact that it's a distinct solution) is to bind a run
> > > to a server on the initial exchange (e.g. do a redirect from the
> > > generic "puppetmaster pool" URL to an equivalent but more specific
> > > "the particular puppetmaster who's handling you for this run" URL).
> > > Session based web services sometimes use this technique.
> >
> >
> > I'm amenable but I've no idea how common/supportable this is. Is this
> > often how load balancers work? I'd expect that if someone wants to
> > throw up an F5 in front of their masters that the F5 would be the only
> > route through to the masters, and I'd (somewhat naïvely) expect there
> > not to be another route to the masters.
>
> Webapps usually get around that with 'sticky' loadbalancing -
> essentially, the loadbalancer can be told to look for a cookie and/or
> request parameter in the request and then makes sure that requests with
> the same cookie value always get routed to the same server.

Actually most load balancers geared for high performance use the sticky
approach but do NOT parse anything in the request. The stickiness (staying
with the same real server in the pool) is implemented by looking at the
IP address of the client making the request and routing that client to the
same real server in the pool on subsequent requests within a timeout.

Request parsing is prohibitively expensive and is avoided; anything that can
be determined by staying within layer-3 TCP/IP is preferable from a
performance point of view.

This is true for F5 loadbalancers I believe (depending on config),
it is most certainly true for linux-ipvs based loadbalancing
which we deploy.
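
A toy illustration of address-based stickiness, assuming a simple source-address hash. `backend_for` is a made-up helper; the persistence described above is actually kept in a table with a timeout, so this stateless hash is a simplification of the idea, not how ipvs or an F5 implements it:

```ruby
require "zlib"

# Map a client address to one backend. The same address always maps to the
# same master as long as the pool is unchanged; nothing in the request
# body is inspected.
def backend_for(client_ip, backends)
  backends[Zlib.crc32(client_ip) % backends.size]
end

pool = ["master1", "master2", "master3"]
puts backend_for("192.0.2.10", pool)
puts backend_for("192.0.2.10", pool)
```

Note that with a plain modulo hash, adding or removing a master remaps most clients.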

> In the Java world, that's what the infamous jsessionid request parameter
> and cookie are for.

Different kind of load balancing, more proxy-like and probably handled by
the application server or a semi-intelligent HTTP frontend (nginx/Apache).

Regards,

Ramon van Alteren

--

Hyves System Engineering

Frederiksplein 42 | 1017 XN Amsterdam
T + 31 (0)206242081 | F +31 (0)207508329 ra...@hyves.nl | ramon71.hyves.nl | www.hyves.nl

Thomas Bellman

Nov 30, 2009, 5:31:58 AM11/30/09
to puppe...@googlegroups.com
Nigel Kersten wrote:

> I can't see a way around shared state for efficient load balancing, but
> think that being able to provide a list of puppet servers to the clients
> would greatly help with failover.

No state at all in the server scales better than having state, because
without any server-side state you don't need to share it between servers.
Similar to how a parallel program that doesn't need to communicate
between the threads scales better than one that does need communication.

(And in this case a pure cache doesn't count as state, because the server
can throw it away and still give identical answers to the clients).

In some cases no server-side state means that you have to avoid some
features. But in this particular case (keeping the receiving of facts
and the generating and sending of the catalog in the same RPC request),
there's no loss of functionality. Storedconfigs is a different thing
of course, but that's not the question here.


/Bellman

Christian Hofstaedtler

Nov 30, 2009, 5:32:27 AM11/30/09
to puppe...@googlegroups.com
* Luke Kanies <lu...@madstop.com> [091130 08:10]:
I had a simpler mechanism in mind:

Do the server selection only once per run. See if it "works" or
outright fails, and then stick to this server. ***

If the server fails in the middle of the current run, you get a
failed run, and the next one will (hopefully) work again.

This way you also don't need to worry _that much_ that all manifests
and files are completely in sync across all servers _all the time_ -
which is another problem not easily solved.


*** If there is an "intelligent" node manager in between, the server
selection is done twice, in this order:

* client picks an initial server to talk to (FT)
* client asks this server (which is really the node manager), whom
to talk to (LB)
* client sends facts to this server (and sticks to it 'til the
end/failure of the run)

> > I'd really like to be able to combine both. Shared state for load
> > balanced pairs, multiple servers in the client config for failover
> > and restarting the current run.
> >
> >
> > a simple solution might be to implement a DNS SRV record (e.g. like
> > LDAP) which allows the client to decide to which puppetmaster he
> > would like to connect to.
> > this in time could be enhanced to get the server load etc (so it
> > could try to use another server or to wait for a while).
> >
> > This is essentially what we're doing now. We have simple monitoring
> > in place so all our clients can check the load of the puppet server
> > their DNS view points to, and fall back to an alternate server if
> > the load is too high.
> >
> >
> > I would be happy not to add any additional depedencies (even though
> > memcache is acceptable) - a specially a database, e.g. if I have 5
> > locations where i need HA + load sharing, i don't want to end up
> > maintaining 5 set of clusters.
> >
> > ++
> >
> > I can't see a way around shared state for efficient load balancing,
> > but think that being able to provide a list of puppet servers to the
> > clients would greatly help with failover.

You don't really need shared state for LB, as long as the client
sticks to a server _and_ it does not reconnect to another server
during the same run.


Christian

--
christian hofstaedtler

Markus

Nov 30, 2009, 11:27:52 AM11/30/09
to puppe...@googlegroups.com
> >
> > It could be even simpler, I think.
> >
> > * Switch to issuing a POST everywhere that we presently issue a GET,
> > with a fallback to GET if the POST is rejected (405) or even better
> > based on the api version for mixed system backwards compatibility.
> >
> > * Switch to accepting both POST and get where we now accept only GET.
>
>
> I'm comfortable with this, and I like the idea of the server being
> agnostic about post/get.
>
> It only gets us halfway, though - we need a POST to return the results
> of another query entirely (i.e., the POST of the Facts should return
> the results of a GET of a Catalog). How would we do that?

I'm envisioning something much simpler. We presently map requests on to
HTTP GETs; instead, we should map them onto HTTP POSTs. That means
we've got unlimited payload size in either direction, but otherwise
nothing changes.

-- Markus

P.S. If you want to be real silly and slavishly adhere to the pretense
that the RPC -> HTTP mapping is valid, meaningful, and the names line up
perfectly you could respond to the POST with a redirection header.

Markus Roberts

Nov 30, 2009, 12:45:23 PM11/30/09
to puppet-dev
> P.S. If you want to be real silly and slavishly adhere to the pretense
> that the RPC -> HTTP mapping is valid, meaningful, and the names line up
> perfectly you could respond to the POST with a redirection header.

This P.S. to my prior post was pre-coffee; doing redirection
eliminates the advantage of switching to POST.

So the correct answer is, map requests to POST instead of GET, but
with the aforementioned flexibility to retain backward compatibility.

-- Markus

Luke Kanies

Nov 30, 2009, 1:33:06 PM11/30/09
to puppe...@googlegroups.com
On Nov 30, 2009, at 8:27 AM, Markus wrote:

>>>
>>> It could be even simpler, I think.
>>>
>>> * Switch to issuing a POST everywhere that we presently issue a GET,
>>> with a fallback to GET if the POST is rejected (405) or even better
>>> based on the api version for mixed system backwards compatibility.
>>>
>>> * Switch to accepting both POST and get where we now accept only
>>> GET.
>>
>>
>> I'm comfortable with this, and I like the idea of the server being
>> agnostic about post/get.
>>
>> It only gets us halfway, though - we need a POST to return the
>> results
>> of another query entirely (i.e., the POST of the Facts should return
>> the results of a GET of a Catalog). How would we do that?
>
> I'm envisioning something much simpler. We presently map requests
> on to
> HTTP GETs; instead, we should map them onto HTTP POSTs. That means
> we've got unlimited payload size in either direction, but otherwise
> nothing changes.

Again, that doesn't solve the real problem - I want a Facts post to
return a Catalog. How, as the client, would I tell the server that?

Or do we just always return a catalog when someone posts facts? That
seems a bit overkill.

> -- Markus
>
> P.S. If you want to be real silly and slavishly adhere to the pretense
> that the RPC -> HTTP mapping is valid, meaningful, and the names
> line up
> perfectly you could respond to the POST with a redirection header.


Yeah, that wouldn't solve the two request/one request problem.

--
While one person hesitates because he feels inferior, the other is
busy making mistakes and becoming superior. -- Henry C. Link

Luke Kanies

Nov 30, 2009, 1:39:36 PM11/30/09
to puppe...@googlegroups.com
Yeah, there's a lot of interesting stuff we can do here, but my main
interest at this point is getting the architecture to the point where
we can actually do these interesting things, rather than starting with
them out of the gate.

--
You don't learn anything the second time you're kicked by a mule.
-- Anonymous Texan

Luke Kanies

Nov 30, 2009, 1:40:57 PM11/30/09
to puppe...@googlegroups.com
The whole concept of these resources (e.g., catalogs and facts) having
canonical, network-wide URLs is completely missing from the system
right now, mostly because I don't see a huge benefit to it yet. I
don't see this as a great place to start.

--
A citizen of America will cross the ocean to fight for democracy, but
won't cross the street to vote in a national election.
--Bill Vaughan

Luke Kanies

Nov 30, 2009, 1:42:03 PM
to puppe...@googlegroups.com
It's pretty clear that the solution here is to switch from GET to POST
and to change the internals of compiling to support receiving the
facts as the payload to a post. That solves the current problems
while leaving us open to doing all kinds of other interesting load
balancing things in the future.
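
A minimal client-side sketch of that switch, including the GET fallback Markus suggested for older servers. The `fetch_catalog` helper, the `transport` callable, the URL layout, and the form encoding of the facts are all assumptions for illustration, not Puppet's actual API:

```ruby
require "uri"

# Hypothetical helper. `transport` is any callable taking
# (method, path, body) and returning [status, body]; it stands in
# for a real HTTP connection so the sketch runs without a network.
def fetch_catalog(transport, environment, facts)
  path    = "/#{environment}/catalog"
  encoded = URI.encode_www_form(facts)

  # Preferred form: facts travel as the POST payload.
  status, body = transport.call(:post, path, encoded)

  # An older server that only accepts GET rejects the POST with 405,
  # so fall back to the facts-in-query form.
  if status == 405
    status, body = transport.call(:get, "#{path}?#{encoded}", nil)
  end

  raise "catalog request failed with status #{status}" unless status == 200
  body
end
```

The fallback keeps mixed-version installations working, at the cost of a second round trip against old servers.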

--
Somewhere on this globe, every ten seconds, there is a woman giving
birth to a child. She must be found and stopped. -- Sam Levenson

Brice Figureau

Nov 30, 2009, 2:08:47 PM
to puppe...@googlegroups.com
On 30/11/09 19:33, Luke Kanies wrote:
> On Nov 30, 2009, at 8:27 AM, Markus wrote:
>
>>>>
>>>> It could be even simpler, I think.
>>>>
>>>> * Switch to issuing a POST everywhere that we presently issue a GET,
>>>> with a fallback to GET if the POST is rejected (405) or even better
>>>> based on the api version for mixed system backwards compatibility.
>>>>
>>>> * Switch to accepting both POST and get where we now accept only
>>>> GET.
>>>
>>>
>>> I'm comfortable with this, and I like the idea of the server being
>>> agnostic about post/get.
>>>
>>> It only gets us halfway, though - we need a POST to return the results
>>> of another query entirely (i.e., the POST of the Facts should return
>>> the results of a GET of a Catalog). How would we do that?
>>
>> I'm envisioning something much simpler. We presently map requests on to
>> HTTP GETs; instead, we should map them onto HTTP POSTs. That means
>> we've got unlimited payload size in either direction, but otherwise
>> nothing changes.
>
> Again, that doesn't solve the real problem - I want a Facts post to
> return a Catalog. How, as the client, would I tell the server that?
>
> Or do we just always return a catalog when someone posts facts? That
> seems a bit overkill.

Call me stupid, but I don't see why you can't POST the facts to the
catalog indirector.

Each time you need a catalog, you must provide the most current facts you
can. Failing to do so (i.e., using GET) means the server will use the
cached facts from <insert your favorite cache method here, including
memcache, node manager, or whatever>.
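
Roughly, the server side of that idea might look like the sketch below. The `CatalogEndpoint` class, its in-memory hash standing in for the shared cache, and the placeholder `compile` method are all hypothetical; they are not the real indirector code:

```ruby
# Hypothetical stand-in for the catalog indirector. The Hash passed to
# the constructor plays the role of the shared cache (memcached, node
# manager, or whatever).
class CatalogEndpoint
  def initialize(cache = {})
    @cache = cache
  end

  # method is :get or :post; facts is a Hash for POST and nil for GET.
  def handle(method, node, facts = nil)
    # Fresh facts supplied with a POST replace the cached copy.
    @cache[node] = facts if method == :post && facts

    current = @cache[node]
    raise KeyError, "no facts known for #{node}" if current.nil?
    compile(node, current)
  end

  private

  # Placeholder for real catalog compilation.
  def compile(node, facts)
    { "node" => node, "facts" => facts }
  end
end
```

A GET for a node that has never posted facts has nothing to compile against, so this sketch raises instead of guessing.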

Luke Kanies

Nov 30, 2009, 2:12:33 PM
to puppe...@googlegroups.com
Yeah, after I sent this I realized that this is what Markus meant. No
idea why it took me so long to figure that out.

There still needs to be some special-case code in the catalog
compiling that looks for that payload, I think. Or should we just
automatically treat any payload as a thing to be saved? That's
probably reasonable.

--
A lot of people mistake a short memory for a clear conscience.
-- Doug Larson

Luke Kanies

Dec 1, 2009, 2:34:08 AM
to puppe...@googlegroups.com
On Nov 30, 2009, at 2:32 AM, Christian Hofstaedtler wrote:

>> With the 'retry' functionality in ruby, the caller never knows of a
>> problem unless none of the servers work. You pick a new server,
>> reconnect, and keep going. I can't think of anything that would
>> reasonably restart the whole run itself.
>
> I had a simpler mechanism in mind:
>
> Do the server selection only once per run. See if it "works" or
> outright fails, and then stick to this server. ***
>
> If the server fails in the middle of the current run, you get a
> failed run, and the next one will (hopefully) work again.
>
> This way you also don't need to worry _that much_ that all manifests
> and files are completely in sync across all servers _all the time_ -
> which is another problem not easily solved.
>
>
> *** If there is an "intelligent" node manager in between, the server
> selection is done twice, in this order:
>
> * client picks an initial server to talk to (FT)
> * client asks this server (which is really the node manager), whom
> to talk to (LB)
> * client sends facts to this server (and sticks to it 'til the
> end/failure of the run)


I actually think that'd be more complicated, but then, I haven't
gotten around to trying to write it yet.
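
For comparison, the retry-based failover quoted at the top of this message could be sketched along these lines. The `with_failover` helper, the `connect` callable, and the choice of `IOError` as the failure signal are assumptions made for illustration:

```ruby
# Hypothetical helper. `connect` is any callable that takes a server
# name and raises IOError when that server cannot be reached.
def with_failover(servers, connect)
  remaining = servers.dup
  begin
    server = remaining.shift
    # RuntimeError is not rescued below, so it reaches the caller.
    raise "no servers available" if server.nil?
    connect.call(server)
  rescue IOError
    # Re-run the begin block with the next server; the caller only
    # sees a failure once every server has been tried.
    retry
  end
end
```

With this shape the caller never chooses a server; Christian's variant would instead pick once per run and let a mid-run failure fail the whole run.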

--
I never did give anybody hell. I just told the truth, and they thought
it was hell. -- Harry S Truman

Thomas Bellman

Dec 1, 2009, 8:21:59 AM
to puppe...@googlegroups.com
Luke Kanies wrote:

> Again, that doesn't solve the real problem - I want a Facts post to
> return a Catalog. How, as the client, would I tell the server that?
>
> Or do we just always return a catalog when someone posts facts? That
> seems a bit overkill.

I get the impression that you need to take a step back and look at things
from a slightly higher level. Is uploading facts really an operation
that we want a client to do? If so, for what reason? Isn't the operation
we want really "get catalog based on the following facts"? Then we will
simply say that a client SHOULD do that by doing

POST /myenvironment/catalog HTTP/1.0

<facts>...

but MAY do it using

GET /myenvironment/catalog?<facts>... HTTP/1.0

(I don't remember the exact syntax for posting a form in HTTP off hand.)
In the server both would be handled identically after decoding the
GET/POST and the "form data". I haven't looked at that code in Puppet,
but any reasonable framework for handling HTTP requests should make it
pretty easy to handle them identically.
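
As a rough sketch of that identical handling, assuming the facts are form-encoded (an assumption; the encoding Puppet actually uses may differ) and using a hypothetical `facts_from_request` helper:

```ruby
require "uri"

# Hypothetical helper: return the same facts Hash whether the facts
# arrived as a POST body or as a GET query string.
def facts_from_request(method, path, body)
  raw = case method
        when :post then body.to_s
        when :get  then URI.parse(path).query.to_s
        else raise ArgumentError, "unsupported method #{method}"
        end
  # decode_www_form yields [key, value] pairs; to_h turns them into a Hash.
  URI.decode_www_form(raw).to_h
end
```

Everything after this decoding step can then ignore which HTTP method was used.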



/Bellman

Russ Allbery

Dec 1, 2009, 5:50:40 PM
to puppe...@googlegroups.com
Thomas Bellman <bel...@nsc.liu.se> writes:

> I get the impression that you need to take a step back and look at things
> from a slightly higher level. Is uploading facts really an operation
> that we want a client to do?

Actually, yes, we want all of our Puppet clients to upload all of their
facts.

> If so, for what reason?

Because we use the fact data for configuration management and inventory.

--
Russ Allbery (r...@stanford.edu) <http://www.eyrie.org/~eagle/>