Data

Martin Simons

Dec 27, 2023, 4:56:30 PM

to help-cfengine

Dear CFEngineer,

FOSDEM and Config Management Camp are about a month away.

I will give a talk on external data for nodes in a landscape, CFEngine nodes included.

The idea is that a node sends a message to Data and then receives a tailor-made set of data it can use for its operation. In principle, Data serves any node that is able to send, receive and process a message.

CFEngine nodes invoke a Python script at the beginning of the process.

The nodes bother Data a bit too much, because they send the message nine times!

So far I have used the following line to obtain the data:

"response" string => execresult("/var/cfengine/bin/cf-message.py '$(node_hard_classes_feed)'","noshell");

The requirements are:

1. The data should become available at the beginning of the run

2. The data has to be stored in JSON format

3. The data has to become available in a variable, not on disk

Is there a way to keep the agent from sending the message nine times?

Regards,

Martin.

Nick Anderson

Dec 27, 2023, 7:05:47 PM

to Martin Simons, help-cfengine

During the update policy, use a commands promise that populates a JSON file. Then you can use the readjson() function or similar during the regular policy run.
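
A minimal sketch of the caching half of this idea, written in Python because the nodes already call a Python script. It assumes the script is allowed to write its response to a local file; `write_cache` is a hypothetical helper, not part of cf-message.py or CFEngine. The response is written to a temporary file and then renamed, so a policy run should never see a half-written file.

```python
import json
import os
import tempfile


def write_cache(data, cache_path):
    """Atomically write `data` as JSON to `cache_path` (hypothetical helper)."""
    directory = os.path.dirname(cache_path) or "."
    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(data, handle)
        # Replace the old cache in a single step.
        os.replace(tmp_path, cache_path)
    except Exception:
        # Clean up the temp file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The commands promise in the update policy would run a script that ends with a call such as `write_cache(response, "/var/cfengine/state/node_data.json")`; that path is only an illustration. The regular policy would then read the same file.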

Nicolas Charles

Dec 28, 2023, 3:41:51 AM

to Nick Anderson, 'Nick Anderson' via help-cfengine, Martin Simons, help-cfengine

What I usually do is guard the execresult with a class defined at runtime. That prevents cf-promises from evaluating it and ensures it executes only once.

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Mike Weilgart

Dec 29, 2023, 1:23:28 AM

to help-cfengine

I would go with Nick's approach also. Fundamentally, you have to cache the answer to avoid pulling it multiple times. You say the data has to be available at the beginning of the run. The natural way to do that is to put the fetching of the data into the update policy.

You should be explicit about some failure cases, since you're talking about an external data dependency for some sort of classification data: What should happen if the data isn't available? Is your policy designed to have sane fallbacks if the response isn't received? Should your policy use the last cached response? Is there some age of the cache that should be considered invalid? Depending on the answers to these, you might make the update policy cache the result locally and save it for reuse, or cache the result locally but delete the local cache if the data node isn't available/if the cache is too old.

Given that you're currently not caching the answer at all, a conservative approach would be to cache it only for a single agent run. In other words, save the JSON to a file from the update policy, then read it in during regular policy, then *delete* the JSON cache as part of the regular policy run. But the failure cases are worth thinking about and being explicit about; if the response is usually static for a given node it may be better/simpler to just update the cache "best effort" for every run of update.cf, but use the last response if the update fails (just like CFEngine policy updates themselves).
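
To make those choices concrete, here is a rough Python sketch of the "best effort, fall back to the last response" variant with an optional age limit. Everything in it is an assumption for illustration: `load_node_data`, the `fetch` callable (a stand-in for the request to Data) and `max_age_seconds` are hypothetical names, and returning `None` stands for "no usable data, use the policy's own fallbacks".

```python
import json
import os
import time


def load_node_data(fetch, cache_path, max_age_seconds=None):
    """Return fresh data if `fetch` succeeds, else an acceptable cached copy.

    `fetch` is any callable returning JSON-serialisable data (hypothetical
    stand-in for the request to Data). Returns None when nothing is usable.
    """
    try:
        data = fetch()
    except Exception:
        data = None  # Data unreachable: fall through to the cache

    if data is not None:
        # Plain write for brevity; a real script may want an atomic rename.
        with open(cache_path, "w") as handle:
            json.dump(data, handle)
        return data

    try:
        age = time.time() - os.path.getmtime(cache_path)
        if max_age_seconds is not None and age > max_age_seconds:
            return None  # cache is too old to trust
        with open(cache_path) as handle:
            return json.load(handle)
    except (OSError, ValueError):
        return None  # no cache yet, or the cache is unreadable
```

With `max_age_seconds=None` the last response is reused indefinitely. The stricter single-run variant described above would instead delete the cache file after reading it.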

Having written all that... I am actually curious what sort of logic the Data node is doing. And why does that logic have to live external to the individual nodes and external to CFEngine policy? :)