List of substrings from another list


Marco Marongiu

Feb 12, 2016, 10:50:40 AM
to help-c...@googlegroups.com
Hi there.

I have a list of strings, e.g.:

> "ntp_servers"
> slist => {
> "ntp1.conectiv.com",
> "208.90.144.72 # ntp0.jrc.us",
> "bonehed.lcs.mit.edu minpoll 9 # server seems to have a strong rate limiting",
> "timekeeper.isi.edu"
> } ;
>

I want to build a new list out of it, say "servers", where each element
of servers is the substring of the corresponding element of
ntp_servers from the beginning of the string up to the first space,
something like:

> "servers"
> slist => {
> "ntp1.conectiv.com",
> "208.90.144.72",
> "bonehed.lcs.mit.edu",
> "timekeeper.isi.edu"
> } ;
>

Do you have any suggestion? I tried in several ways and CFEngine always
showed me the middle finger...

Not finding a way to do it, not even indirectly, I tried to get there
with a few intermediate steps. I started with building an array of lists
using string_split:

> "split_server[$(ntp_servers)]"
> slist => { string_split("$(ntp_servers)","\s+","2") } ;
>

However, any attempt to reliably retrieve
$(split_server[$(ntp_servers)])[0] resulted in pain (maybe because of
spaces embedded in the key? That has given me problems since 3.4). I
either get the string half-expanded, e.g.:

> R: DEBUG: server: $(split_server[ntp1.conectiv.com][0])

in the case where string_split just returned one element, or an empty
string where it returned more:

> R: DEBUG: server: $(split_server[ntp1.conectiv.com][0])

Not being sure whether I was getting the data structure I expected, I
transformed the array into a data container and then extracted a string
representation of it:

> "data_split"
> data => mergedata("split_server") ;
>
> "data_string"
> string => storejson("data_split") ;
>

and the string representation was correct:

> R: DEBUG: data_string: {
> "208.90.144.72 # ntp0.jrc.us": [
> "208.90.144.72",
> "# ntp0.jrc.us"
> ],
> "timekeeper.isi.edu": [
> "timekeeper.isi.edu"
> ],
> "bonehed.lcs.mit.edu minpoll 9 # server seems to have a strong rate limiting": [
> "bonehed.lcs.mit.edu",
> "minpoll 9 # server seems to have a strong rate limiting"
> ],
> "ntp1.conectiv.com": [
> "ntp1.conectiv.com"
> ]
> }

However, a reports promise containing $(data_split[$(ntp_servers)][0])
never printed anything, while an assignment like this:

> "server[$(ntp_servers)]"
> string => "$(data_split[$(ntp_servers)][0])" ;
>

resulted in a segmentation fault (return code 139)

None of this is fun...

Ciao
-- bronto

Ted Zlatanov

Feb 12, 2016, 3:31:57 PM
to help-c...@googlegroups.com
On Fri, 12 Feb 2016 16:50:37 +0100 Marco Marongiu <bront...@gmail.com> wrote:

MM> I have a list of strings, [] I want to build a new list
MM> where each element [] will be the substring of the
MM> corresponding element [] from the beginning of the
MM> string and up to the first space

The below gives two solutions: first, we convert the slist into a data
container and use string_split() on its elements; and second, we join
the slist elements into a single string and use parsestringarray()
(noting that the single string must not exceed 4K and the slist elements
must be OK as classic array keys).

The proposed jq integration in
https://github.com/cfengine/core/pull/2319 will make this kind of work
trivial, but it's much more powerful than just list transformations.

Finally, the new regex_replace() function in 3.9 simplifies an aspect of
this specific case, and if you'll look at
https://dev.cfengine.com/issues/7346#note-14 I proposed a list version
of regex_replace() that would completely solve your problem. The only
reason I didn't create a new ticket is that I wasn't sure people would
find it useful. So if you'd like the list version of regex_replace(),
look at my comment and open a new ticket if you agree with it.

#+begin_src cfengine3
bundle agent main
{
  methods:
      "test";

  vars:
      "test_state" data => bundlestate(test);
      "test_string" string => storejson(test_state);

  reports:
      "$(this.bundle): state of things = $(test_string)";
}

bundle agent test
{
  vars:
      "ntp_servers_slist"
        slist => {
                   "ntp1.conectiv.com",
                   "208.90.144.72 # ntp0.jrc.us",
                   "bonehed.lcs.mit.edu minpoll 9 # server seems to have a strong rate limiting",
                   "timekeeper.isi.edu"
                 };

      "ntp_servers" data => mergedata(ntp_servers_slist);

      "idx" slist => getindices(ntp_servers);
      "collect_full_$(idx)" slist => string_split(nth("ntp_servers", $(idx)), " ", 2);
      "collect[$(idx)]" string => nth("collect_full_$(idx)", 0);

      "joined" string => join($(const.n), ntp_servers_slist);
      "dim" int => parsestringarray(collect2, $(joined), "\s*#[^\n]*", " ", inf, inf);
      "collect2_servers" slist => getindices(collect2);
}

#+end_src

Output:

#+begin_src text
% cf-agent -KI -f ~/sync/cf/test/test_collect_substring.cf
R: main: state of things = {
"collect2[208.90.144.72][0]": "208.90.144.72",
"collect2[bonehed.lcs.mit.edu][0]": "bonehed.lcs.mit.edu",
"collect2[bonehed.lcs.mit.edu][1]": "minpoll",
"collect2[bonehed.lcs.mit.edu][2]": "9",
"collect2[ntp1.conectiv.com][0]": "ntp1.conectiv.com",
"collect2[timekeeper.isi.edu][0]": "timekeeper.isi.edu",
"collect2_servers": [
"bonehed.lcs.mit.edu",
"timekeeper.isi.edu",
"208.90.144.72",
"ntp1.conectiv.com"
],
"collect[0]": "ntp1.conectiv.com",
"collect[1]": "208.90.144.72",
"collect[2]": "bonehed.lcs.mit.edu",
"collect[3]": "timekeeper.isi.edu",
"collect_full_0": [
"ntp1.conectiv.com"
],
"collect_full_1": [
"208.90.144.72",
"# ntp0.jrc.us"
],
"collect_full_2": [
"bonehed.lcs.mit.edu",
"minpoll 9 # server seems to have a strong rate limiting"
],
"collect_full_3": [
"timekeeper.isi.edu"
],
"dim": "4",
"idx": [
"0",
"1",
"2",
"3"
],
"joined": "ntp1.conectiv.com\n208.90.144.72 # ntp0.jrc.us\nbonehed.lcs.mit.edu minpoll 9 # server seems to have a strong rate limiting\ntimekeeper.isi.edu",
"ntp_servers": [
"ntp1.conectiv.com",
"208.90.144.72 # ntp0.jrc.us",
"bonehed.lcs.mit.edu minpoll 9 # server seems to have a strong rate limiting",
"timekeeper.isi.edu"
],
"ntp_servers_slist": [
"ntp1.conectiv.com",
"208.90.144.72 # ntp0.jrc.us",
"bonehed.lcs.mit.edu minpoll 9 # server seems to have a strong rate limiting",
"timekeeper.isi.edu"
]
}
#+end_src
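
For the regex_replace() route mentioned earlier in this message, a minimal, untested sketch could look like the following. It assumes CFEngine 3.9 or later and that regex_replace() takes (string, regex, replacement, options); the variable names "server" and "servers" are made up for illustration. It reuses the numeric-index approach from the policy above, so no array key ever contains a space.

#+begin_src cfengine3
bundle agent main
{
  vars:
      "ntp_servers_slist"
        slist => {
                   "ntp1.conectiv.com",
                   "208.90.144.72 # ntp0.jrc.us",
                   "bonehed.lcs.mit.edu minpoll 9 # server seems to have a strong rate limiting",
                   "timekeeper.isi.edu"
                 };

      # Convert to a data container so elements can be addressed by position.
      "ntp_servers" data => mergedata(ntp_servers_slist);
      "idx" slist => getindices(ntp_servers);

      # Hypothetical names: delete everything from the first whitespace
      # character to the end of the element (assumes 3.9+ regex_replace).
      "server[$(idx)]"
        string => regex_replace(nth("ntp_servers", $(idx)), "\s.*", "", "");

      # Collect the array values into a list; classic array order is not guaranteed.
      "servers" slist => getvalues(server);

  reports:
      "server: $(servers)";
}
#+end_src

Elements without any whitespace should pass through unchanged, since the pattern simply does not match them.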

MM> However, a reports promise containing $(data_split[$(ntp_servers)][0])
MM> never printed anything, while an assignment like this:

>> "server[$(ntp_servers)]"
>> string => "$(data_split[$(ntp_servers)][0])" ;
>>

MM> resulted in a segmentation fault (return code 139)

Can you please submit a bug on this?

Thanks
Ted

Marco Marongiu

Feb 13, 2016, 4:01:11 PM
to help-c...@googlegroups.com
On 12/02/16 21:31, Ted Zlatanov wrote:
> The below gives two solutions: first, we convert the slist into a data
> container and use string_split() on its elements; and second, we join
> the slist elements into a single string and use parsestringarray()
> (noting that the single string must not exceed 4K and the slist elements
> must be OK as classic array keys).

Thanks Ted, I shall check that!


> The proposed jq integration in
> https://github.com/cfengine/core/pull/2319 will make this kind of work
> trivial, but it's much more powerful than just list transformations.

+1


> Finally, the new regex_replace() function in 3.9 simplifies an aspect of
> this specific case, and if you'll look at
> https://dev.cfengine.com/issues/7346#note-14 I proposed a list version
> of regex_replace() that would completely solve your problem. The only
> reason I didn't create a new ticket is that I wasn't sure people would
> find it useful. So if you'd like the list version of regex_replace(),
> look at my comment and open a new ticket if you agree with it.

I did it, but the ticket is closed...


> MM> However, a reports promise containing $(data_split[$(ntp_servers)][0])
> MM> never printed anything, while an assignment like this:
>
>>> "server[$(ntp_servers)]"
>>> string => "$(data_split[$(ntp_servers)][0])" ;
>>>
>
> MM> resulted in a segmentation fault (return code 139)
>
> Can you please submit a bug on this?

Uhm... OK will do :)


Thanks again, ciao!
-- bronto

Marco Marongiu

Feb 18, 2016, 4:12:07 PM
to help-c...@googlegroups.com
On 13/02/16 22:01, Marco Marongiu wrote:
>> > Can you please submit a bug on this?
> Uhm... OK will do :)

Done: http://dev.cfengine.com/issues/7952

Ciao
-- bronto

Alex Georgopoulos

Feb 19, 2016, 2:26:17 PM
to help-cfengine
This segfaults on 3.8.1 as well.  

Just out of curiosity: this policy you have here seems kinda strange. I'm assuming you are building an ntp configuration. Is there a reason you cannot just put the data into JSON from the start? It seems an awfully roundabout way of doing this. Why not have this in your policy? Mustache is nice in that it doesn't put variables in templates if they don't expand.

{ "ntp_servers" : [ { "server" : "ntp1.conectiv.com" },
      { "comment" : "ntp0.jrc.us",
        "server" : "208.90.144.72"
      },
      { "comment" : "server seems to have a strong rate limiting",
        "options" : "minpoll 9 ",
        "server" : "bonehed.lcs.mit.edu"
      },
      { "server" : "timekeeper.isi.edu" }
    ] }
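
As a rough, untested sketch of how such a container might be rendered with Mustache: the example below assumes a CFEngine version that provides string_mustache(template, container), and the names "ntp" and "rendered" are made up for illustration. Only the "server" key is used; how "options" and "comment" would be laid out is left open.

#+begin_src cfengine3
bundle agent main
{
  vars:
      # A shortened copy of the JSON above, parsed into a data container.
      "ntp"
        data => parsejson('{ "ntp_servers": [
                   { "server": "ntp1.conectiv.com" },
                   { "server": "208.90.144.72", "comment": "ntp0.jrc.us" },
                   { "server": "timekeeper.isi.edu" }
                 ] }');

      # One "server" line per list entry; $(const.n) expands to a newline
      # before the template is handed to string_mustache().
      "rendered"
        string => string_mustache("{{#ntp_servers}}server {{server}}$(const.n){{/ntp_servers}}", ntp);

  reports:
      "$(rendered)";
}
#+end_src

In a real policy the same container would more likely be passed to a files promise as template data for a Mustache template file rather than rendered into a string.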



Marco Marongiu

Feb 19, 2016, 3:07:41 PM
to help-c...@googlegroups.com
On 19/02/16 20:26, Alex Georgopoulos wrote:
> Just out of curiosity: this policy you have here seems kinda strange.
> I'm assuming you are building an ntp configuration. Is there a reason
> you cannot just put the data into JSON from the start?

Because this is not the start :-) It's a policy that I initially built
on 3.3.x and then evolved in 3.4 to use templates. I am assessing if I
can go forward a bit more with the same policy and data, or if I have to
give up and rewrite at least the data.


> seems awfully roundabout way of doing this.

It is, for the reasons above.


> Why not have this in your policy? Mustache is nice in that it doesn't
> put variables in templates if they don't expand.

Interesting feature, thanks :-)

Ciao!
-- bronto

Alex Georgopoulos

Feb 19, 2016, 3:14:52 PM
to help-cfengine
That policy in the original example wouldn't work on 3.4, thus my curiosity. I'm still not quite sure why you need to slice and dice the slist. The data in the slist would be fine in a simple array expansion in an ntp configuration.

Marco Marongiu

Feb 19, 2016, 3:19:09 PM
to Alex Georgopoulos, help-cfengine

The policy I posted was just a sample/test, mimicking the data I have in the real policy. I need to extract the server names/addresses to automatically build access rules that will allow clients to synchronise themselves but not to serve time to other clients.

Ciao!


Marco Marongiu

Mar 2, 2016, 6:08:01 AM
to help-c...@googlegroups.com
I finally got time to get back to this. The policy listed below works.
Two things are clear to me at this point:

* we need to look into our old 3.3.x policy and make it more modern;
* the chicanery we have to do to achieve such a simple data
transformation is a pretty clear sign that we need those functions Ted
has proposed :-)

Thanks Ted!

Ciao
-- bronto



Marco Marongiu

Mar 2, 2016, 6:14:36 AM
to help-c...@googlegroups.com
On 02/03/16 12:07, Marco Marongiu wrote:
> I finally got time to get back to this. The policy listed below works.

Oh see, I did NOT include the policy :-)

Here we go, sorry!




Attachment: t.cf

Ted Zlatanov

Mar 3, 2016, 11:05:05 AM
to help-c...@googlegroups.com
On Wed, 2 Mar 2016 12:07:57 +0100 Marco Marongiu <bront...@gmail.com> wrote:

MM> I finally got time to get back to this. The policy listed below works.
MM> Two things are clear to me at this point:

MM> * we need to look into our old 3.3.x policy and make it more modern;
MM> * the chicanery we have to do to achieve such a simple data
MM> transformation is a pretty clear sign that we need those functions Ted
MM> has proposed :-)

I think you'll find the recently created
https://github.com/cfengine/core/pull/2514 (allow nesting slist and data
container functions) extremely useful for the data transformations you
may need.

Ted
